A MICROGENETIC APPROACH TO WORD ASSOCIATION¹


JOHN H. FLAVELL, JURIS DRAGUNS, LEONARD D. FEINBERG, AND WILLIAM BUDIN
University of Rochester 


In a recent paper, Flavell and Draguns (2) have sketched the history of
the concept of microgenesis (the series of events presumed to occur in
the course of a single, brief conceptual or perceptual act) and have
attempted to derive, as first steps in the formulation of a
microdevelopmental approach, certain tentative propositions about such
transient evolutions. In the particular case of word association,
theoretical considerations suggest that there are covert word responses
that push for expression early in the associative process and that these
early microgenetic forms are less logical, more “paleological” (1) in
character than later ones. It is further assumed that when normal
individuals associate in a relatively unhurried fashion, these
“immature” responses are usually suppressed, the microgenetically later,
more logical associations being the ones actually spoken. If these
assumptions are correct, the likelihood of such immature responses being
given overtly by normal individuals should increase under conditions
that force very rapid response. Another current working hypothesis of
the microgenetic approach, following Schilder (7), suggests that
schizophrenic thinking can be characterized as microgenetically immature
cognition. That is, schizophrenics tend to give overt expression as a
final product to early cognitions that normals usually suppress in favor
of the later, more mature cognitive responses.

It is the primary purpose of the study to make a preliminary test of two
broad hypotheses derived from these speculations. First, normal subjects
instructed to associate very quickly should give more microgenetically
immature responses and fewer mature ones in word association than should
comparable normal subjects performing without extreme time pressure.
Second, schizophrenics should likewise give more immature and fewer
mature associations than comparable normals, under conditions that do
not involve pressure to respond immediately. The first two experiments
to be described constitute attempts to explore the first hypothesis; the
third experiment bears upon the second.

1 The authors wish to thank Eugene Sachs for his assistance in
collecting a portion of the data.


EXPERIMENT I 


Method 


Eighty-four university undergraduates served as 
Ss: 52 women and 32 men. The apparatus consisted of 
a .01 sec. Standard Electric Timer, started and stopped 
by a hand key, and a Selmer Metronoma electric 
metronome. 

All 84 Ss were first given a 61-item word association 
test and all protocols were scored in accordance with 
the scoring system to be described in the next section.²
The Ss were then divided into three groups of 28 each 
—Slow, Fast, and Fast-Distraction—roughly equated 
for sex and for mean number of responses scored 
“microgenetically immature” on this pretest word list. 

A second association test of 79 common words was 
then administered individually to each of the 84 Ss. 
The first three authors served as Es, each taking a 
roughly equal share in testing of Ss of each sex within 
each group. The Ss were blindfolded throughout the 
test in order to avoid distraction. E read aloud instructions
and stimulus words and recorded reaction times
to the nearest .01 sec. by means of the hand key and
electric timer. The key was depressed when E began to
say the stimulus word and was released when S started 
to say the response word. Slow Ss were instructed to 
respond to each stimulus word with the first word 
which came to mind. However, they were also told 
not to rush or hurry the association process; rather, 
they were instructed to sit back, relax, and wait for the 
first association to come to mind. Reaction times were 
to be taken, they were told, but only “for the record.” 
The instructions for the Fast group, on the other hand, 
stressed repeatedly the paramount importance of re- 
sponding as quickly as possible. These Ss were told 
that, above all, they were to respond with a word just 
as quickly as they possibly could. Members of the 
third, Fast-Distraction group³ were given speed-oriented
instructions as were the Fast Ss but, in addition, had 





2 Word association lists, instructions to Ss, and full 
scoring instructions for association categories have 
been deposited with the American Documentation 
Institute. Order Document No. 5461, remitting $1.25 
for 35-mm. microfilm or $1.25 for 6 by 8 in. photo- 
copies. Make checks payable to Chief, Photoduplica- 
tion Service, Library of Congress. 

3 The purpose of including the Fast-Distraction group 
in the study was twofold. First, it constituted an 
additional “time-pressure” group with which to com- 
pare the Slow group. Second, it provided an opportunity 
to find out whether time pressure coupled with dis- 
traction yielded more immature responses than time 
pressure alone (Fast group). 
also the distracting task of making a pencil stroke at
every third beat of a metronome set to click twice per
second throughout the administration of the test. Both
Fast and Fast-Distraction Ss were from time to time
urged by E to increase their speed of response in the
course of presenting the first 13 stimulus words. These
13 words were thus treated as “practice words” and 
only the responses to the remaining 66 words were 
used in the statistical analyses. 


Scoring 


A number of investigators have described scoring 
systems designed to assess the formal relationships be- 
tween stimulus and response words (3, 4, 6, 9). From 
these systems, the writers selected 16 categories which, 
with modifications, seemed to permit relatively easy 
and unambiguous scoring. These word association 
categories were classified on theoretical grounds into 
three types: first, those presumed to reflect micro- 
genetically early or immature cognition; second, those 
presumed to reflect cognition in the later, final phases 
of microdevelopment; third, certain miscellaneous 
categories of indeterminate microgenetic phase that 
had been used in previous word association studies and 
appeared to deserve further empirical study. A brief 
description of the 16 categories follows. 

A. Immature Categories:

1. Completion. The response word completes a word or common phrase of
which the stimulus word is a part (needle–“haystack,” glow–“worm”).

2. Distant. The response word is either unrelated or distantly related
in meaning to the stimulus word, or else is a decidedly idiosyncratic,
personal association which may possess a moderate semantic relationship
to the stimulus word (love–“Mary,” book–“darkness”).

3. Perseveration. The response word is unrelated or only distantly
related to its stimulus word but is either obviously related to the
stimulus or response word just preceding it in the series or else is
identical to some stimulus or response word given earlier in the series.

4. Perseveration-meaningful. The response word is identical to some
stimulus or response word given earlier but is also closely related to
its stimulus word.

5. Clang. The response word seems to have been primarily achieved by
means of a process of association based upon physical rather than
semantic relationships (cold–“gold,” glow–“fast”⁴).

6. Emotional. The response word clearly connotes an affective,
evaluative judgment by S with respect to the referent of the stimulus
word (love–“good,” snake–“ugh!”).

7. Repetition. The response word is a repetition or a partial repetition
of the stimulus word, or is a grammatical variant of the stimulus word
(expectation–“expect,” deep–“depth”).

8. Multi-word-discrete. The S gives two or more response words to a
single stimulus word and these response words denote different or
discrete concepts rather than constituting a multi-word expression of a
single concept (jewel–“ruby,” then “diamond”).

B. Mature Categories:

1. Synonym. The response word is an almost exact synonym of the stimulus
word (liberty–“freedom”).

2. Supraordinate. The response word denotes a class of which the
stimulus word’s referent is a class member (egg–“food”).

3. Subordinate. The response word denotes a member of the class
signified by the stimulus word (fruit–“apple”).

C. Indeterminate Categories:

1. Attribute. The response word is an adjective and the stimulus word a
noun or vice versa, providing only that the adjective could actually
modify the noun in question (deep–“ocean,” but not bed–“sleepy”).

2. Verb. The response word is a verb and the stimulus word a noun or
vice versa, providing only that the action denoted by the verb could
actually be carried out by, or upon, the object signified by the noun
(snore–“father,” cheese–“eat”).

3. Contrast-coordinate. The response word is either an antonym of the
stimulus word or a coordinate, complementary term on about the same
level of abstraction (dark–“light,” salt–“pepper,” wall–“ceiling”).

4. Multi-word. The S gives two or more response words to a single
stimulus word but these response words all express a single concept
(citizen–“a person who pays taxes”).

5. Blocking. A response is either not offered at all or else has a
latency greater than 10 seconds.

4 When S has given “fast” in response to “slow” just previously, the
presumption being that “glow” has here been assimilated to “go.”


Rationale

The rationale for assigning particular categories to early versus late
microdevelopmental phases was briefly this. Synonym, Supraordinate, and
Subordinate are thought to reflect late phases because of their obvious
logical, “secondary process” character. Clang and Repetition responses
are believed to give evidence of an associative sequence in its very
earliest phase, one which has not progressed beyond mere apprehension of
the physical characteristics of the stimulus word. Perseveration,
Perseveration-meaningful, and Completion suggest a contextual rather
than strictly logical connection between the stimulus word and either
another word with which it is frequently and mechanically paired in
everyday life (Completion) or some word still echoing from a previous
association (Perseveration and Perseveration-meaningful). Similarly, it
is early rather than late in thought development that mediated and
affect-driven associations occur (Distant), that a number of discrete
associative directions may exist without resolution into a single
direction (Multi-word-discrete), and that primitive, dichotomous
judgments of a “like–dislike” or “good–bad” character prevail
(Emotional).⁵


5 For a more detailed account of the assumed char- 
acteristics of early versus late developmental phases, 
see Flavell and Draguns (2). It might be mentioned 
here, however, that the present assignment of response 
categories to one or another phase stems from the 
authors’ own (and perhaps idiosyncratic) integration 
of a variety of existing statements about the micro- 
developmental process. Hence, the present interpreta- 
tion of any given category may differ in some particulars 
from any single previous interpretation of similar cate- 
gories. As an example, the Distant category is both 
defined and interpreted somewhat differently from the 
response class of the same name established by Rapa- 
port, Gill, and Schafer (6). 
Scoring Conventions


In using the scoring system, several conventions
were adopted. Following Rapaport, Gill, and Schafer 
(6), the system permits multiple scoring of a single 
response. Two limitations were imposed, however. 
First, if a response could be scored for both an imma- 
ture and a mature or indeterminate category, it was 
scored for the immature category only. For example, if 
a response word were a synonym for a stimulus word 
but had also occurred earlier in the series, it was scored 
Perseveration-meaningful, not Synonym. Also, a re- 
sponse was not scored for both of two categories if the 
two categories were not independent by definition. 
For instance, a response scored Perseveration would by 
definition also be a Distant response and would there- 
fore be scored only for the more specific category, 
that is, Perseveration. It was found that about 80 per 
cent of all responses given received at least one scoring 
in this system. It should be emphasized that despite 
strenuous attempts to rid it of scoring ambiguities, the 
present scoring system still possesses numerous imper- 
fections. Thus, it was sometimes difficult to decide 
between Distant and not Distant, Perseveration and 
Perseveration-meaningful, Completion and Contrast- 
coordinate, and Emotional and Attribute, to take but a 
few examples. In order to get some estimate of the 
interjudge reliability of the scoring system, 11 ran- 
domly chosen protocols were independently scored by 
two of the Es. There was agreement on 589, or 80 per
cent of the 733 scorings made. This level of agreement, 
although not as high as would be desirable, compares 
favorably with the average agreement of about 67 per 
cent achieved by Karwoski and Berthold (4) using half 
as many scoring categories. Recognizing that interpre- 
tative biases of the scorer must inevitably enter into 
the scoring, it was decided that only one E (the senior 
author) should score all protocols in order to make the 
bias more or less constant rather than variable across 
groups.⁶ Further, each E conducted a brief inquiry,
following administration of the test, on those responses


6 It is recognized that this procedure does not in itself
guard against the kind of bias effects wherein the
scorer, being at the same time an E, may unconsciously
resolve difficult scoring problems in favor of his hy- 
potheses in cases where the group identity of the pro- 
tocol is known to him—or, conversely, “lean over 
backwards” to err systematically against his hypothe- 
ses. In the present study, the scorer—as he scored 
—unfortunately knew the group identity of all the 
protocols in Experiment II and a few of the protocols in 
Experiments I and III. To be sure, certain precautions 
were taken. In cases where the group identity was 
known, the scorer resolved really borderline scoring 
problems conservatively, i.e., against his hypothesis; in 
Experiment II the scorer compared each S’s Fast- 
Reward protocol to his previous Slow one and auto- 
matically scored identical responses in each protocol 
identically; and throughout, the scorer continuously 
checked back to previous protocols to make sure his 
scoring standards remained as constant as possible. 
These precautions notwithstanding, it is important to 
underscore the fact that the scoring procedure used 
does render all findings in this study less definitive 
than would have been the case had an experimentally 
naive scorer been used. 


TABLE 1

WORD ASSOCIATION RESPONSE LATENCY FOR SLOW, FAST, AND
FAST-DISTRACTION GROUPS
(Time in Seconds)

          Slow    Fast    Fast-Distraction
Mean      2.27    1.60    1.55
SD        0.66    0.23    0.26





TABLE 2

CATEGORY SCORES FOR SLOW, FAST, AND FAST-DISTRACTION GROUPS

[Table body not recoverable: Mean and SD, by category, for the Slow,
Fast, and Fast-Distraction groups.]

which he felt might raise difficult scoring problems
later. 
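
The two scoring limitations and the interjudge agreement figure can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the category names and the counts (589 of 733) come from the text, but the function names, the set-based representation, and the `SUBSUMES` table are hypothetical and are not the authors' procedure. Only the one dependent pair named in the text (Perseveration and Distant) is encoded.

```python
# Category names follow the paper; everything else here is an
# illustrative assumption, not the authors' actual scoring procedure.
IMMATURE = {
    "Completion", "Distant", "Perseveration", "Perseveration-meaningful",
    "Clang", "Emotional", "Repetition", "Multi-word-discrete",
}
MATURE = {"Synonym", "Supraordinate", "Subordinate"}
INDETERMINATE = {
    "Attribute", "Verb", "Contrast-coordinate", "Multi-word", "Blocking",
}

# Pairs that are not independent by definition: the key (more specific)
# category replaces the value. Only the pair named in the text is listed.
SUBSUMES = {"Perseveration": "Distant"}


def resolve_scores(candidates):
    """Apply the two stated limitations to a set of candidate categories."""
    scores = set(candidates)
    # Limitation 1: an immature scoring displaces any mature or
    # indeterminate scoring of the same response.
    if scores & IMMATURE:
        scores &= IMMATURE
    # Limitation 2: of two categories that are dependent by definition,
    # keep only the more specific one.
    for specific, general in SUBSUMES.items():
        if specific in scores:
            scores.discard(general)
    return scores


def percent_agreement(agreed, total):
    """Interjudge agreement as a percentage of all scorings made."""
    return 100.0 * agreed / total


# The synonym-that-recurs example from the text.
print(resolve_scores({"Synonym", "Perseveration-meaningful"}))
# Agreement on 589 of 733 scorings, as reported.
print(round(percent_agreement(589, 733), 1))
```

Multiple scoring is otherwise left untouched, so a response that qualifies for two independent categories of the same type keeps both.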


Results 


Table 1 shows the reaction time data for
the three groups. As was expected, both Fast
and Fast-Distraction groups showed significantly
briefer mean latencies (P < .01) than
the Slow group (t = 5.04 and 5.38, respectively).⁷
The differences in mean latency between
Fast and Fast-Distraction groups were
not statistically significant.
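
As a rough check on these latency comparisons, a t ratio can be recomputed from the group summaries in Table 1 (Slow 2.27, SD 0.66; Fast 1.60, SD 0.23; Fast-Distraction 1.55, SD 0.26; 28 Ss per group). The paper does not say which form of the t test was used, so the sketch below assumes an independent-groups test with a standard error built from the two group variances; the function name is hypothetical, and the inputs are the rounded tabled values.

```python
import math


def t_from_summary(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Independent-groups t ratio computed from means, SDs, and group sizes.

    Assumption: the standard error is sqrt(sd_a^2/n_a + sd_b^2/n_b).
    The source does not state the exact formula the authors used.
    """
    standard_error = math.sqrt(sd_a ** 2 / n_a + sd_b ** 2 / n_b)
    return (mean_a - mean_b) / standard_error


# (mean, SD, n) per group, read from Table 1.
slow = (2.27, 0.66, 28)
fast = (1.60, 0.23, 28)
fast_distraction = (1.55, 0.26, 28)

t_slow_vs_fast = t_from_summary(*slow, *fast)
t_slow_vs_fd = t_from_summary(*slow, *fast_distraction)
print(round(t_slow_vs_fast, 2), round(t_slow_vs_fd, 2))
```

Under these assumptions both ratios land close to the reported 5.04 and 5.38; small gaps are to be expected from rounding in the tabled means and SDs.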

The actual word association response data were analyzed by means of a
three-way chi-square analysis of variance recently described by Wilson
(10). Separate analyses on each of the 16 categories were performed for
group (three) times sex (two) times experimenter (three). None of the
main effects for sex and experimenter proved statistically significant,
although there were significant but uninterpretable sex-experimenter
interactions (P < .05) for two categories, Completion and Repetition.
With regard to main effects for group, it was predicted that the Fast
and Fast-Distraction groups would exceed the Slow group in frequency of
immature responses and the converse for mature responses. Only two main
effects for group were actually found to be statistically reliable:
Perseveration (P < .001) and Distant (P < .01). Individual chi-square
analyses for these two categories yielded the following findings: first,
for Perseveration, Fast greater than Slow (P < .001) and
Fast-Distraction greater than Slow (P < .10); second, for Distant, Fast
greater than Slow (P < .01) and Fast-Distraction greater than Slow
(P < .01). Table 2 shows Means and SDs, by category, for Slow, Fast and
Fast-Distraction groups.

7 All P values reported in this paper are for the two-tailed test.

In summary, there was no evidence that 
Fast and Fast-Distraction conditions had differ- 
ing effects on word association responses as 
scored in this experiment. None of the five 
indeterminate categories showed significant 
differences among the three groups. Of the 
eleven categories about which predictions 
were made, two showed significant Slow-Fast 
and Slow-Fast-Distraction differences. 


EXPERIMENT II


In an attempt to clarify the rather ambigu- 
ous findings described above, a second experi- 
ment was conducted. Eighteen of the original 
28 Slow group Ss were retested individually on 
the same list and by the same Es 8-12 months
after the original testing. In the hopes of 
effecting even briefer response times than were 
obtained under Fast and Fast-Distraction 
conditions, these Ss were paid five cents for 
every response made within 1.20 seconds. So 
that the Ss could pace themselves, the reaction 
time to each response was read aloud by E 
immediately after the response was given. 
Further, S also heard a click following each 
response made within the 1.20 second limit. 
As before, S was blindfolded and was given 
the usual 13 practice words prior to the test 
proper. Thus, this Fast-Reward group con- 
stituted 18 Slow Ss retested under essentially 
Fast instructions with money serving as a 
reward for quick responding. As before, all 
responses were scored by the senior author.⁸


8 The scoring was identical to that of the previous 
experiment with one exception: because of certain 
scoring difficulties peculiar to Multi-word-discrete, this 
category was dropped from the classification system. 


As was expected, Fast-Reward instructions 
led to significantly briefer latencies than did 
Slow instructions. In fact, every one of the 
18 Ss decreased in mean reaction time from 
Slow to Fast-Reward conditions, the group 
Mean and SD for the Fast-Reward latencies 
being 1.44 seconds and 0.11 seconds versus 
2.31 seconds and 0.18 seconds for the same 
Ss under Slow conditions. Contrary to expecta- 
tion, however, response times under Fast- 
Reward conditions were not significantly 
briefer than the Fast and Fast-Distraction 
latencies (Table 1). 

Word association responses under Slow 
and Fast-Reward conditions were compared, 
for each category, by means of the Wilcoxon 
Signed Ranks technique for paired replicates 
(5). In accord with prediction, there were 
significantly more Perseveration, Perseveration- 
meaningful, and Clang responses (P < .01) 
and significantly fewer Subordinate responses 
(P < O01) under Fast-Reward instructions 
than under Slow instructions. Of the indeter- 
minate categories, Attribute showed a signifi- 
cant decrease in frequency from Slow to Fast- 
Reward conditions (P < .01). Table 3 shows 
Means and SDs for each category under test 
(Slow) and retest (Fast-Reward) conditions. 
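
For readers unfamiliar with the paired-replicates technique, the statistic behind these comparisons can be sketched as follows. This is the textbook signed-ranks computation, not a reproduction of the authors' worksheets: zero differences are dropped and tied absolute differences receive average ranks, both of which are assumptions since the paper does not describe its handling of them. The per-subject counts are hypothetical and serve only to make the example runnable.

```python
def wilcoxon_signed_rank(before, after):
    """Return (W_plus, W_minus, n) for paired scores.

    Zero differences are dropped; tied absolute differences share the
    average of the ranks they occupy (assumed conventions).
    """
    diffs = [a - b for b, a in zip(before, after) if a != b]
    order = sorted(range(len(diffs)), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * len(diffs)
    i = 0
    while i < len(order):
        # Find the run of tied absolute differences starting at position i.
        j = i
        while (j + 1 < len(order)
               and abs(diffs[order[j + 1]]) == abs(diffs[order[i]])):
            j += 1
        average_rank = (i + j) / 2 + 1  # mean of 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    return w_plus, w_minus, len(diffs)


# Hypothetical counts of one category for eight Ss (not data from the paper).
slow_counts = [1, 0, 2, 1, 3, 0, 2, 1]
fast_reward_counts = [3, 2, 2, 4, 3, 1, 5, 0]
print(wilcoxon_signed_rank(slow_counts, fast_reward_counts))
```

The smaller of the two rank sums is then referred to a table of critical values for the number of nonzero differences; that lookup is omitted here.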


EXPERIMENT III 


A third experiment was conducted in order 
to find out whether a group of schizophrenics 
would differ from a matched group of normals 
in the same way, with regard to word associa- 
tion responses, as normals under time pressure 
differ from normals without time pressure. 

The Ss for this study were schizophrenics and hospital aides from a
Veterans Administration neuropsychiatric hospital. The patient group was
a fairly typical sample of the testable members of this hospital’s
schizophrenic population. That is, most but not all of the sample were
chronic patients of several years’ hospitalization; some were nearly
free from overt symptoms and some were obviously psychotic. All patients
were free from known organic impairment and none was undergoing electric
shock treatment at the time of testing. The groups were matched, on a
man-to-man basis, for age and verbal intelligence as estimated by the
Vocabulary subtest of the Wechsler Adult Intelligence Scale. Group Means
and SDs are reported in Table 4, none of the group differences
approaching statistical reliability. The groups were not matched for
education, although group differences on this variable similarly turned
out to be far from statistical significance. The word association test
was administered individually to the Ss of both groups by the fourth
author. The instructions for the two groups were in all essentials
identical to those previously given to the Slow group Ss in Experiment
I, i.e., the S was told to give his first associated word but not to
feel rushed or hurried, etc. As was the case with the previous groups,
the senior author scored all protocols.


TABLE 3

CATEGORY SCORES FOR 18 Ss TESTED UNDER SLOW INSTRUCTIONS AND RETESTED
UNDER FAST-REWARD INSTRUCTIONS

[Table body not recoverable: Mean and SD under Slow and Fast-Reward
instructions for each Immature (a), Mature, and Indeterminate category.]

a The category Multi-word-discrete was not scored in this experiment.
See Footnote 8.


TABLE 4

CHARACTERISTICS OF NORMAL AND SCHIZOPHRENIC GROUPS

                          Age in Years          Verbal IQ         Grade Completed
Group                     Mean          SD      Mean      SD      Mean          SD
Normal (N = 20)           [illegible]   12.98   101.95    13.52   10.65         2.13
Schizophrenic (N = 20)    36.30         11.57   100.60    12.55   [illegible]   2.43


TABLE 5

CATEGORY SCORES FOR NORMAL AND SCHIZOPHRENIC GROUPS

[Table body not recoverable: Mean and SD for the Normal and
Schizophrenic groups on each Immature, Mature, and Indeterminate
category.]

The Wilcoxon Signed Ranks technique for 
paired replicates was again used to assess group 
differences for each category. Significant 
differences in the predicted direction were 
found for Distant (P < .02), Perseveration 
(P < .01), Clang (P < .05), Synonym (P < .05), 
and Subordinate (P < .05). No other group 
differences attained statistical reliability. Table 
5 shows Means and SDs for each category by 
group. 

In order to assess the possible effects of age, 
education, cultural background etc. on word 
association as scored in the present system, 
a second, incidental analysis was performed. 
The 12 Slow group males were compared with 
the 20 hospital aides on each category by means 
of the Mann-Whitney “U” Test for un- 
matched groups (5). The student group showed 
significantly more Completion (P < .01), 
Perseveration (P < .01), Perseveration-meaning- 
ful (P < .05), and significantly fewer Blocking 
(P < .01) and Supraordinate (P < .01).
These findings, although possibly suggestive 
of further word association research within 
the framework of differential psychology, 
were in no sense predicted here and have no 
theoretical status in the present experiment. 
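
The unmatched-groups comparison used in this incidental analysis rests on the U statistic, which can be sketched directly from its pairwise definition. The scores below are hypothetical, the function name is an assumption, and counting ties as one half is a common convention that the paper itself does not specify.

```python
def mann_whitney_u(group_a, group_b):
    """Return (U_a, U_b) for two independent groups.

    U_a counts the pairs in which the score from group_a exceeds the score
    from group_b, with ties counted as one half (assumed convention).
    """
    u_a = 0.0
    for a in group_a:
        for b in group_b:
            if a > b:
                u_a += 1.0
            elif a == b:
                u_a += 0.5
    u_b = len(group_a) * len(group_b) - u_a
    return u_a, u_b


# Hypothetical category counts (not data from the paper).
students = [5, 7, 4, 6]
aides = [2, 4, 3, 1, 3]
print(mann_whitney_u(students, aides))
```

The smaller of the two U values is the one compared against tabled critical values for the two group sizes; the pairwise count used here gives the same result as the more familiar rank-sum formula.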


DISCUSSION 


The findings of these three experiments can 
be approached from two points of view: (a) as 
empirical information about the word associa- 
tion process in various groups and under 
different conditions of administration; (b) as
findings relevant to the alleged microgenesis 
of word association. Considered from the first 
point of view, some of our data are relevant 
to previous association studies. Thus, Siipola, 
Walker, and Kolb (8) recently described an 
experiment comparing the word association 
responses of groups tested under free versus 
time pressure conditions. They found that 
noun responses to adjective stimulus words
significantly decreased under time pressure
conditions. This result is paralleled by the
decrease in Attribute responses under Fast-Reward
conditions in the present study
(Experiment II). Siipola et al. also found,
however, that “contrast” responses significantly
increased under pressure instructions.
The present study, on the other hand, failed to
find any significant group differences for
Contrast, although the category presumably
had somewhat different content. Similarly,
Rapaport, Gill, and Schafer (6) compared 
the word associations of normals and schizo- 
phrenics using a number of scoring categories, 
some of which are fairly comparable to those 
used in the present investigation. They too 
found a significantly higher frequency of 
categories roughly equivalent to Distant, 
Perseveration, and Clang in the schizophrenic 
group than in the normal group. Contrary to 
our own findings, however, Rapaport et al.
also report significant group differences for the 
near-equivalents of categories Repetition and 
Completion. It is of course especially hazardous 
to try to interpret the divergent results of two 
such studies, in view of the lack of complete 
identity of the categories on which findings are 
compared. 

More important within the present frame- 
work, however, is the possible significance of 
the results for a microgenetic view of word 
association. It will be recalled that the only 
categories for which predicted group differ- 
ences were confirmed were the following: 
(a) Experiment I: Perseveration and Distant;
(b) Experiment II: Perseveration,
Perseveration-meaningful, Clang, and Subordinate;
(c) Experiment III: Perseveration, Distant, Clang,
and Subordinate. Perhaps a parsimonious
interpretation would be as follows. Insofar 
as the findings are reliable (see again Footnote 
6), a small group of categories—including 
Perseveration and perhaps Distant, Clang, and 
Subordinate—seems best to express word 
association response tendencies common to 
schizophrenics and to normals under time 
pressure. If the further interpretation is 
adopted that this common core of categories 
reflects a common attained level of micro- 
development in the associative process, then 
it seems very likely that the original hypoth- 
esis, involving a larger number of categories, 
was too broad and nonspecific. That is, it may 





well be that microdevelopmental immaturity, 
at least insofar as word association is concerned, 
is most accurately characterized by some but 
not all of the attributes that at one time or 
another have been assigned to it. 

Such a view of the present results suggests 
several broad lines of future investigation. 
First of all, further attempts could be made to 
verify the present findings within the area of 
word association, with the added intent of 
specifying more closely the cognitive processes 
that lie behind the specific responses scored as 
Perseveration, Clang etc. Second, one could 
endeavor to develop methods of investigating 
microgenetically immature cognitive responses 
in different types of tasks, for example, the 
completing of incomplete sentences under time 
pressure. In such a study it would be of para- 
mount interest to find out whether or not it 
would be precisely those categories analogous 
to Perseveration, Clang etc. which increase in 
frequency under pressure conditions. Further, 
it would be of great interest to determine 
whether this same response cluster emerges 
in normals under debilitating conditions other 
than time pressure, e.g., under the influence 
of various kinds of pharmaceutical agents. 

Finally, it is important to add that, in our 
view, microgenetic concepts seem at present 
to constitute a somewhat specialized develop- 
mental approach to cognitive phenomena 
rather than a cognitive “theory” in any strict 
sense of the term. For example, the micro- 
developmental formulation is hardly rigorous 
enough in its present form to rule out alterna- 
tive explanations of the present findings cast 
in terms of the rigidifying effects of time 
pressure, the inability of schizophrenics to 
maintain a constant word association set, 
and the like. As an approach, therefore, its chief
value may ultimately lie in whatever fecundity 
it may possess in the way of generating fresh 
and novel experimentation on familiar, com- 
monplace phenomena. 


SUMMARY 


The present study constitutes an attempt 
to test certain general hypotheses derived from 
a microgenetic approach to word association. 
In the first two experiments, using college 
students as subjects, association responses 
given under time pressure conditions were 
compared with those given without time pres- 
sure. In the third experiment, word associa- 





tions of a group of schizophrenics and a 
matched group of hospital aides were similarly 
compared, both groups in this case responding 
without time pressure. It was predicted that 
the word associations of the college students 
performing under time pressure would differ 
from those of the college students responding 
under free conditions in exactly the same way 
as the responses of the schizophrenics would 
differ from those of the aides. Further, the 
precise character of the group differences in 
question was specified in the hypotheses. These 
predictions were in part supported by the 
results. That is, several of the predicted group 
differences were confirmed and a partial con- 
gruence between student-student differences 
and schizophrenic-aide differences was found. 
An interpretation of these findings was sug- 
gested and possible directions for further 
research within a microgenetic orientation
were specified. 
IMPROVEMENT IN THE PERFORMANCE OF SCHIZOPHRENICS
ON CONCEPT FORMATION TASKS AS A FUNCTION OF
MOTIVATIONAL CHANGE¹


DAVID K. CAVANAUGH 
University of Buffalo 


formation tasks has been demonstrated 

repeatedly as a characteristic of schizo- 
phrenia (1, 2, 12, 19, 20). A variety of inter- 
pretations have been offered as to the origin 
and nature of this impairment. 

A group of theorists typified by Goldstein 
stress the similarity between the performance 
of schizophrenic and organic patients. Gold- 
stein indicates that the similarities in the 
behavior of organics and schizophrenics points 
to an organic process (perhaps of secondary 
origin) as cause of schizophrenic impairment 
(9, 10, 11). Another group of theorists typified 
by Cameron (3, 4, 5, 6) suggest that the 
personality disorganization of the schizo- 
phrenic is basic to the latter’s decrement in 
conceptualization. Thus, a high degree of 
abstracting ability may be present in the 
schizophrenic and be confounded by the severe 
personality disorientation. For Cameron this 
represents the product of the schizophrenic’s 
“social disarticulation.” 

Whiteman (22) notes that from Cameron’s 
theory one can predict a differential deficit 
with schizophrenics on socially toned as op- 
posed to formally toned concepts. Goldstein, 
on the other hand, would seem to suggest a 
generalized deficit with no differential decre- 
ment as a function of social test content. 
Whiteman reports findings which he interprets 
as “lending presumptive support to a theoreti- 
cal position which also stresses the importance 
of social withdrawal as a determinant of 
cognitive functioning in schizophrenia” (22, 
p. 271). 

Another group of theorists emphasize the 




1 This report is adapted from a thesis submitted in 
partial fulfillment of the requirements for the Ph.D. 
degree at The University of Buffalo. The author is 
indebted to Walter Cohen, under whose guidance the 
study was carried out, and to Ira Cohen for his advice 
and assistance. Grateful acknowledgment is also made 
of the criticisms and suggestions of others of the 
psychology faculty of The University of Buffalo. The 
writer is grateful to the professional staffs of the VA 
hospitals at Buffalo and Canandaigua, N. Y., for their 
generosity and interest in supplying the time and 
facilities necessary for the completion of the study. 


distinction between performance and potential 
ability with schizophrenics. Shakow (14, 18) 
distinguishes between schizophrenic learning 
ability and capacity. Hunt and Cofer (13) 
conceive of psychological deficit in schizo- 
phrenia as a concomitant of mental illness. 
They describe behavior as a product of the 
stresses faced by an individual and his abilities 
to handle these stresses. In schizophrenia, the 
patient’s limited ability to handle stress leads 
to such abnormal behavior as withdrawal. The 
addition of a new stress, more potent than the 
stresses which have led to schizophrenia, may 
at least temporarily reverse the trend toward 
inappropriate behavior and result in actions 
more similar to those of normals. The psycho- 
logical deficit of schizophrenics is thus per- 
ceived as a performance variable rather than 
an accurate indicant of the patient’s present 
capacity. Pascal and Swenson (17) report 
results consistent with the position of Hunt 
and Cofer. Using an escape from aversive 
stimuli design, Pascal demonstrated schizo- 
phrenic performance at near the level of 
normals on a complex discrimination reaction 
time type of learning problem. Cohen (7) used 
an escape from aversive stimuli design on a task 
involving the learning of a series of motor 
responses to successively presented visual 
stimuli. His results indicated facilitation for 
the more motivated schizophrenics as com- 
pared to patients under mere rapport condi- 
tions. 

The purpose of the present study was to 
investigate the role of motivation in the 
performance of schizophrenics on concept 
formation tasks. Two questions were asked: 
Does escape from aversive stimuli decrease 
impairment in concept formation with schizo- 
phrenics? And are there differences in the 
formation of social as opposed to formal con- 
cepts under this condition of increased motiva- 
tion? 

METHOD 
Subjects 


Hospitalized male veterans were used as Ss in the 
study. The 36 normal Ss were obtained from the 
medical and surgical wards of the VA General Medical 
and Surgical Hospital at Buffalo, N. Y. Patients were 
selected whose admission diagnosis and past medical 
history were reasonably devoid of psychosomatic 
etiology. The age range was from 20 to 40 years. 

The 108 schizophrenic Ss were selected from the 
VA Neuropsychiatric Hospital at Canandaigua, N. Y. 
These individuals had been diagnosed and were being 
treated as schizophrenics. Only those patients were used 
whose diagnoses were uncomplicated by neurological 
involvement. As with the normals, the age range was 
from 20 to 40 years. For this sample, the additional 
criteria of contact and cooperativeness were necessary. 
No systematic differences were noted between subgroups 
with regard to such variables as age, type of 
schizophrenia, educational level, or vocational level. 
For both normals and schizophrenics, the following 
proportions of the groups obtained the Wechsler- 
Bellevue vocabulary weighted score units as specified: 
13—6%; 12—12%; 11—28%; 10—33%; 9—18%; 
8—6%. The average weighted score for both popula- 
tions was 10.39. 
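
The reported average can be checked against the reported distribution. The sketch below is only an arithmetic check, not part of the original analysis. It assumes the printed percentages are rounded figures, since as printed they sum to 103 rather than 100, and it therefore normalizes by the sum of the listed percentages.

```python
# Share of Ss (percent) at each Wechsler-Bellevue vocabulary weighted score,
# copied from the figures printed in the text.
reported_percent = {13: 6, 12: 12, 11: 28, 10: 33, 9: 18, 8: 6}


def weighted_mean(percent_by_score):
    """Mean weighted score, normalized by the sum of the listed percentages."""
    total = sum(percent_by_score.values())
    return sum(score * pct for score, pct in percent_by_score.items()) / total


percent_total = sum(reported_percent.values())
mean_score = weighted_mean(reported_percent)

print(percent_total)         # 103 (assumed to reflect rounding in the printed figures)
print(round(mean_score, 2))  # 10.39
```

Under that assumption the normalized mean agrees with the stated average of 10.39.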


Tests and Apparatus 


In addition to the vocabulary subtest from the 
Wechsler-Bellevue, two concept formation tests were 
administered to each S in the study. These tests were 
essentially the same as those which Whiteman (22) 
found to be most effective in differentiating between 
schizophrenic and nonpsychotic groups. The formal 
concepts are defined by Whiteman as reflecting non- 
interpersonal stimuli which can be described in physical, 
psychophysical, or quantitative terms such as volume 
or number, or in logical-relational terms as with con- 
cepts based on principles of hierarchical classification 
(e.g., species, genus) or relational concepts such as 
“middleness.” This test is composed of 11 items with 
each item consisting of five 3” by 3” cards. The S’s task 
is to place each set of five cards in a proper sequence 
according to a logical-relational concept such as arrange- 
ment according to size of the box shown in each picture 
(Item 1) or according to the period of architecture noted 
in each picture (Item 7). The items are scored for time 
and accuracy, with a maximum time of 90 seconds per 
item. 

The social concepts are described by Whiteman (21) 
as representing abstractions common to a number of 
situations involving human interaction. This test is 
composed of 21 items, 18 of which contain four 3” by 3” 
stimulus cards, while three of the items contain six 
stimulus cards. In all items, three of the cards depicted 
instances of one social concept, whereas one (or three) of 
the cards did not depict this concept. In executing the 
social concept items, S’s task was to separate the three 
cards that had the same idea from those that did not 
have that particular idea. For example, in the “rescue” 
item (Item 1), the three concept cards included (a) 
a scene depicting a fireman rescuing a person from a 
burning building, (b) a picture of a man rescuing a girl 
from the path of an advancing truck, and (c) a picture of 
a man throwing a life preserver to a swimmer in distress. 
The nonconcept card was that of a scene showing two 
children on sleds racing toward a busy intersection. The 
items were scored for both time and accuracy, with a 
maximum time of 90 seconds for the four-card items 
and 120 seconds for the six-card items. 


The apparatus which supplied the aversive stimuli 
described below consisted of a tape recorder and earphones. 
The output of a white noise generator was 
recorded on a one-hour tape. The generator was set at 
an intensity level of 116 decibels above absolute zero 
(10⁻¹⁶ watts per square centimeter) with random 
frequencies from 60 to 10,000 cycles per second (limited 
by recording apparatus and earphones). This level of 
noise was that selected by Pascal and Swenson (17) in 
the previously described study and was reported by 
them as being disturbing but not injurious. When ad- 
ministered, the noise began at the time of presentation 
of a given test item and was terminated upon either the 
correct completion of the item by the S or the expiration 
of the maximum time limit for the item. 
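
For orientation, the decibel level can be converted to a physical intensity. This is a minimal sketch, assuming the zero-decibel reference is the conventional acoustic value of 10⁻¹⁶ watts per square centimeter; the function and constant names are illustrative only.

```python
# Assumed zero-decibel reference intensity, in watts per square centimeter.
REFERENCE_W_PER_CM2 = 1e-16


def intensity_from_db(level_db, reference=REFERENCE_W_PER_CM2):
    """Convert an intensity level in decibels to watts per square centimeter."""
    return reference * 10 ** (level_db / 10)


noise_intensity = intensity_from_db(116)
print(f"{noise_intensity:.2e} W/cm^2")  # about 3.98e-05 under the assumed reference
```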


Procedure 

On the basis of preliminary screening, interviews, 
and testing, a pool of available Ss was established for 
each of the groups. From this aggregate, Ss were 
assigned to experimental and control groups in a random 
manner, stratified according to intelligence level. The 
four groups may be defined as follows: 1. Normal— 
Control (NC) Group: 18 normal (nonpsychiatric) Ss 
who received the concept formation tasks under the 
usual conditions for the administration of 
tests. 2. Normal—Noise (NN) Group: 18 normals who 
received the tests in the presence of white noise which 
was terminated at the successful completion of each test 
item or the expiration of the maximum time limit for 
each item. 3. Schizophrenic—Control (SC) Group: 54 
schizophrenics who received the tests under conditions 
similar to those of Group NC. 4. Schizophrenic—Noise 
(SN) Group: 54 schizophrenics who received the tests 
under conditions similar to those of Group NN. 
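
Random assignment stratified by intelligence level could be carried out along the following lines. This is a sketch under stated assumptions, not the author's actual procedure: the vocabulary weighted score is taken as the stratum, Ss within a stratum are shuffled and dealt alternately to the two conditions, and the per-stratum counts in the example are hypothetical.

```python
import random
from collections import Counter, defaultdict


def stratified_assign(subjects, conditions=("control", "noise"), seed=0):
    """Assign subjects to conditions at random within each stratum.

    subjects: list of (subject_id, stratum) pairs.
    Returns a dict mapping subject_id to a condition name.
    """
    rng = random.Random(seed)
    by_stratum = defaultdict(list)
    for subject_id, stratum in subjects:
        by_stratum[stratum].append(subject_id)

    assignment = {}
    for stratum in sorted(by_stratum):
        members = by_stratum[stratum]
        rng.shuffle(members)  # random order within the stratum
        for i, subject_id in enumerate(members):
            assignment[subject_id] = conditions[i % len(conditions)]
    return assignment


# Hypothetical pool of 36 normal Ss: stratum -> number of Ss (illustrative counts).
hypothetical_counts = {13: 2, 12: 4, 11: 10, 10: 12, 9: 6, 8: 2}
pool = [(f"N{score}-{k}", score)
        for score, n in hypothetical_counts.items() for k in range(n)]

groups = stratified_assign(pool)
group_sizes = Counter(groups.values())
print(group_sizes["control"], group_sizes["noise"])  # 18 18
```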

The order of presentation of the tests was randomized 
throughout the testing of all samples. The instructions 
given to the S at the time of testing were those used by 
Whiteman (21), consisting of a brief description of the 
tests and the definition of S’s task as previously noted. 
For those Ss who received the tests under the noise con- 
ditions, additional instructions were given. As in Pascal 
and Swenson’s study (17), the S was told that the 
experimenter was interested in seeing how well S could 
perform under conditions of distraction. It was indi- 
cated that if a particular arrangement of items was the 
incorrect solution, the noise would continue until S 
formed the correct solution. Similarly, in the nonnoise 
groups, Ss were informed of incorrect solutions and 
urged to rework the problems. On completion of a test 
item, S was asked to verbalize the idea present in his 
arrangement of the cards. Inability to verbalize a 
realistic interpretation of the concept resulted in the 
item being scored as though it had not been completed. 

Two scoring systems were used in the analysis of the 
Ss’ performance on the tasks. The first scoring system 
(S) gave three credits for an appropriate response within 
the time limit, regardless of the number of corrections 
made. The second scoring system (B) added time bonus 
credits to the procedure of Scoring System S. System S 
was intended to reflect whether or not the concepts 
were available to the individuals, whereas System B was 
directed toward the problem of how easily the con- 
cepts were available. 
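
The two scoring systems can be sketched as follows. The three credits of System S and the rule that an unverbalized solution counts as not completed come from the text; the size of the time bonus in System B is not reported, so the bonus rule below is a purely hypothetical placeholder supplied by the caller.

```python
def score_s(solved, seconds, time_limit, verbalized=True):
    """System S: three credits for an appropriate, verbalized response within the limit."""
    return 3 if (solved and verbalized and seconds <= time_limit) else 0


def score_b(solved, seconds, time_limit, bonus, verbalized=True):
    """System B: System S credit plus a time bonus given by a caller-supplied rule."""
    base = score_s(solved, seconds, time_limit, verbalized)
    return base + bonus(seconds, time_limit) if base else 0


def example_bonus(seconds, time_limit):
    """Hypothetical bonus (not from the paper): one credit per unused third of the limit."""
    return int(3 * (time_limit - seconds) // time_limit)


print(score_s(True, 30, 90))                 # 3
print(score_s(True, 30, 90, False))          # 0: concept not verbalized
print(score_b(True, 30, 90, example_bonus))  # 5 under the hypothetical bonus rule
```

Keeping the bonus as a parameter makes explicit that System B differs from System S only in rewarding speed.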
RESULTS 


An A × B × L analysis of variance design 
(15) was used in the present study. “A” represents 
the population variable (schizophrenic 
and normal), and “B” signifies the experimental 
conditions (noise and nonnoise). “L” 
represents the control variable of intelligence 
introduced to achieve higher precision in the 
evaluation of A and B. Six of these analyses 
are included in the present report, representing 
scores of the Ss to each of the tests by itself 
as well as total scores (social plus formal) for 
each S. The scores were computed by means of 
Scoring Systems S and B described above. A 
relatively low correlation was found between 
Systems S and B (r = .28). 

The results of the six A × B × L analyses 
are summarized in Table 1. In all cases, the 
SN group approximated the performance level 
of the normals, and the SC group performed at 
a level significantly lower than the other three. 
On the formal test (Scoring System B), the low 
performance of the SC group was offset by the 
very high performance of the SN group, so 
a significant difference between psychotic and 
normals was not obtained. 

In all analyses, the variable of noise versus 
nonnoise was found to display significant 
effects, as was the case with the control variable 
of intelligence. The interaction of conditions 
and populations was significant in four of the 
six analyses. In the case of the social and total 
scores with Scoring System S, this interaction 
was not significant. In these cases, the effects 
of noise with the schizophrenics were not as 
pronounced as in the other analyses, although 
the schizophrenic groups performed at significantly 
different levels, and the schizophrenic 
noise group was not significantly differentiated 
from the normals. 

TABLE 1 

The populations × intelligence interaction 
was not significant in any of the analyses, as 
was the case for the populations × conditions 
× intelligence interaction. In two cases (the 
social test in both scoring systems), the conditions 
× intelligence interaction was not 
significant. Inspection of the data in the other 
four analyses revealed that this interaction was 
significant with the formal test, which resulted 
in significance also being found for the total 
scores with this interaction. Evaluation of the 
formal test data suggested that although the 
effects of noise were not consistent at all intel- 
ligence levels, no systematic trends could be 
found to account for this lack of consistency. 


DISCUSSION 


The schizophrenics performed at a level 
significantly inferior to normals under the 
usual testing conditions. These results indi- 
cate that the concept formation tests developed 
by Whiteman are of value in differentiating 
between normals and schizophrenics. The 
schizophrenics who took the tests in the 
presence of the white noise which was termi- 
nated upon the formation of correct responses 
consistently performed at a level superior to 
that of the schizophrenics under usual condi- 
tions. In fact, they approximated the func- 
tioning of the normal groups. This finding 
serves to re-emphasize the importance of the 
distinction between samples of performance 
and inferences about capacity levels with 
schizophrenics. This result is consistent with 
that reported by Cohen (7) and by Pascal and 
Swenson (17) and the theoretical position of 
Hunt and Cofer noted previously. The finding 
that the normal groups were not differentiated 
by the motivation conditions is also reported 
by Cohen. His interpretation is that normals 
work near their limit of proficiency under 
ordinary testing conditions. 

Under conditions of increased motivation, 
schizophrenics attained the performance level 
of normals on both social and formal test 
material. The similarities between the analyses 
based on the two scoring systems suggest that 
the concepts were not only available to the 
schizophrenics but seemingly were as easily 
available to them as to the normals. Increased 
decrements with schizophrenics on social as 
opposed to formal concepts, as reported by 
Whiteman (22), were not apparent for the 
SN group of the present study. Within the 
present design, it was not feasible to compare 
directly the performance of the SC group on 
the two types of task content. 

The results of the present study do not 
support the relative permanence and irreversi- 
bility of cognitive decrement indicated by 
Goldstein and other organically oriented 
theorists. The results also fail to support the 
differential deficit and progressive decrement in 
capacity for appropriate thinking implied by 
Cameron (3, 4, 5, 6). It would appear that 
normal, socially acceptable thinking may be 
relatively easily available to the schizophrenic, 
at least for schizophrenics who meet the 
criteria of contact and cooperativeness neces- 
sary for the present study. 

The improvement in schizophrenic per- 
formance under the conditions of the present 
study refers only to a relatively brief and 
temporary improvement in a limited area of 
cognitive functioning. If subsequent research 
bears out the present findings in other areas of 
functioning and with more disoriented patients, 
a variety of implications are evident. In the 
treatment of schizophrenia, for example, it 
may be that efforts to modify the motivation 
of schizophrenics will prove more fruitful than 
the more usual attempt to retrain the patient’s 
disorganized thinking. 

The suggestion that schizophrenics suffer 
from decreased motivation, evident in the 
present study, is present in several theories of 
abnormal behavior. Theorists such as Dollard 
and Miller (8) and Mowrer (16) are representa- 
tive of writers who describe personality malad- 
justment in terms of the patient’s unsuccessful 
attempts to cope with anxiety. Within this 
framework, aspects of schizophrenic behavior 
are said to connote defensive efforts oriented 
around the attempt to reduce anxiety. Withdrawal 
from anxiety-arousing situations, avoidance 
of the stresses of adult environmental 
demands, and a retreat from reality become 
useful media for reduction of anxiety in the 
schizophrenic. In the absence of more appropri- 
ate methods of coping with anxiety, the schizo- 
phrenic’s adjustment appears relatively more 
satisfactory to him than does his former adjust- 
ment. Changes in his present motivation 
toward relinquishing schizophrenic defenses 
would thus appear as a prerequisite for a return 
to a more adequate adjustment. 

In addition to research which might help to 
define the limits of hypotheses noted in the 
present report and to explore the generality 
of the present observations, investigations of 
motivation as related to avoidance learning or 
reward learning paradigms, for example, might 
lead to even more emphatic effects on schizo- 
phrenic behavior. 


SUMMARY 


This study tested the effects of escape from 
aversive stimuli conditions upon the performance 
of schizophrenics and normals. 
Concept formation tasks involving social and 
formal concepts were used as measures of 
performance. The schizophrenics and normals 
were each divided into two groups, Noise and 
Control, and matched on an intelligence 
measure. The concept formation tests were 
then administered. The control groups re- 
ceived the tests under the usual conditions 
of psychological testing. White noise was intro- 
duced through earphones to the Ss of the 
Noise groups at the presentation of each test 
item. The noise continued until the item was 
completed or the maximum time limit of the 
item was exceeded. 

Under the conditions of escape from aversive 
stimuli, schizophrenics approximated the per- 
formance level of the normals with both types 
of test content. Schizophrenics who took the 
concept formation tests under the usual condi- 
tions of psychological testing performed at a 
level both inferior to and significantly differ- 
entiated from the normals and the more 
motivated schizophrenics. 

The findings were interpreted in terms of an 
experimentally induced increase in schizo- 
phrenic motivation, accompanied by a tem- 
porary relinquishing of schizophrenic defenses 
and consequent performance at a level more 
representative of potential ability than that 
elicited in the usual testing situations. Sug- 
gestions for further evaluation of the role of 
motivation in schizophrenia were made. 


REFERENCES 


1. Benjamin, J. D. A method for distinguishing and 
evaluating formal thinking disorders in schizophrenia. 
In J. S. Kasanin (Ed.), Language and 
thought in schizophrenia. Berkeley: Univer. of 
California Press, 1946. Pp. 65-90. 

2. Bolles, M. M., & Goldstein, K. A study of the 
impairment of “abstract behavior” in schizophrenic 
patients. Psychiat. Quart., 1938, 12, 42-65. 

3. Cameron, N. Reasoning, regression, and communication 
in schizophrenics. Psychol. Monogr., 1938, 
50, No. 1 (Whole No. 221). 

4. Cameron, N. Deterioration and regression in 
schizophrenic thinking. J. abnorm. soc. Psychol., 
1939, 34, 265-270. 

5. Cameron, N. Schizophrenic thinking in a problem-solving 
situation. J. ment. Sci., 1939, 85, 1012-1035. 

6. Cameron, N. The functional psychoses. In J. 
McV. Hunt (Ed.), Personality and the behavior 
disorders. Vol. II. New York: Ronald, 1944. Pp. 
861-921. 

7. Cohen, B. D. Motivation and performance in 
schizophrenia. J. abnorm. soc. Psychol., 1956, 
52, 186-190. 

8. Dollard, J., & Miller, N. E. Personality and 
psychotherapy. New York: McGraw-Hill, 1950. 

9. Goldstein, K. The significance of special mental 
tests for diagnosis and prognosis in schizophrenia. 
Amer. J. Psychiat., 1939-1940, 96, 575-578. 

10. Goldstein, K. Methodological approach to the 
study of schizophrenic thought disorder. In J. S. 
Kasanin (Ed.), Language and thought in schizophrenia. 
Berkeley: Univer. of California Press, 
1946. Pp. 17-40. 

11. Goldstein, K., & Scheerer, M. Abstract and 
concrete behavior. Psychol. Monogr., 1941, 53, 
No. 2 (Whole No. 239). 

12. Hanfmann, Eugenia, & Kasanin, J. Conceptual 
thinking in schizophrenia. Nerv. ment. Dis. 
Monogr., 1942, No. 67. 

13. Hunt, J. McV., & Cofer, C. N. Psychological 
deficit. In J. McV. Hunt (Ed.), Personality and 
the behavior disorders. Vol. II. New York: Ronald, 
1944. Pp. 971-1032. 

14. Huston, P. E., & Shakow, D. Learning in schizophrenia. 
J. Pers., 1948, 17, 52-74. 

15. Lindquist, E. F. Design and analysis of experiments 
in psychology and education. Boston: Houghton 
Mifflin, 1953. 

16. Mowrer, O. H. Learning theory and personality 
dynamics. New York: Ronald Press, 1950. 

17. Pascal, G. R., & Swenson, C. Learning in mentally 
ill patients under conditions of unusual 
motivation. J. Pers., 1952, 21, 240-249. 

18. Shakow, D. The nature of deterioration in schizophrenic 
conditions. Nerv. ment. Dis. Monogr. 
Ser., 1946, 70, 1-88. 

19. Vigotsky, L. Thought in schizophrenia. Arch. 
Neurol. Psychiat., 1934, 31, 1063-1077. 

20. Wegrocki, H. Generalizing ability in schizophrenia. 
Arch. Psychol., 1940, No. 254. 

21. Whiteman, M. The performance of schizophrenics 
on social concepts. Unpublished doctoral dissertation, 
New York Univer., 1952. 

22. Whiteman, M. The performance of schizophrenics 
on social concepts. J. abnorm. soc. Psychol., 1954, 
49, 266-271. 


Received July 3, 1957 








SOME CLINICAL CORRELATES OF OPERANT BEHAVIOR¹ 


MARTHA T. MEDNICK anp OGDEN R. LINDSLEY 
Harvard Medical School 


When techniques of operant conditioning 
are adapted to investigate 
the behavior of chronic psychotic 
patients (5, 6), the subject (S) is placed in a 
small room equipped with a lever which can 
be manipulated in order to obtain various 
types of reinforcements. Appropriate brief 
verbal instructions are given. The intensive 
study of approximately 50 Ss recently revealed 
that there are great individual differences in 
the rate at which a patient pulls a lever in 
order to obtain a particular reinforcement. In 
searching for an explanation of these differ- 
ences, it has been found that the rate of re- 
sponse is not related to admission diagnosis, 
intelligence quotient, age, or total time of 
hospitalization (10). On the basis of an inspec- 
tion of the data and an extremely gross rating 
of patients’ clinical behavior, however, the 
rate of response appeared to be directly related 
to “depth of psychosis or severity of illness” 
(10). An attempt to verify this relationship by 
King et al. (2) with acutely ill psychotic patients 
was in the predicted direction but not 
statistically significant. They observed a con- 
cave downward curvilinear relationship be- 
tween severity of illness and rate of operant 
responding. The group evidencing medium 
severity of neuropsychiatric illness responded 
at the highest operant rate. 

The present study was primarily designed to 
examine systematically the relationship of 
severity of illness to experimental performance 
in a chronic patient population. 


METHOD 


Twenty-two male chronic psychotic patients, hos- 
pitalized from 3 to 47 years with a median of 16 years, 
and six male hospital attendants were the Ss used. 
These Ss were all taking part in an ongoing study in 
operant conditioning. The patient group had had from 
10 to 448 hour-long sessions of experience in the ex- 


1 The work reported in this paper was accomplished 
under Contracts N5-ori-07662 and Nonr-1866(18), 
sponsored by the Group Psychology Branch, Office of 
Naval Research and under research grant MH-977 
from the National Institute of Mental Health of the 
National Institutes of Health, Public Health Service. 
This paper was written while the senior author was a 
USPHS Post-doctoral Research Fellow at the Harvard 
Medical School. 


perimental room, with a median of 18 sessions. Each 
normal was run for 50 hours and was then placed on 
extinction. Briefly, the operant conditioning session 
consisted of placing each S alone for one hour in a 
small room equipped with a vending machine device. 
Figure 1 depicts this experimental situation. The S 
was given verbal instructions to the effect that if he 
operated the lever he would get candy (or some other 
reinforcement). A detailed statement of method is 
given by Lindsley (5). The Ss in this study were rein- 
forced with a mixture of penny candy and cigarettes. 
The reinforcements were delivered on a one-minute 
variable-interval schedule (1′ VI), i.e., an average of 
one reinforcement per minute. 
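
The logic of a variable-interval schedule can be illustrated with a short simulation. This is a sketch under assumptions the text does not confirm: intervals are drawn from an exponential distribution with a 60-second mean, the first response at or after the end of an interval is reinforced, and the next interval starts at that reinforced response. The function names are illustrative.

```python
import random


def draw_intervals(n, mean_s=60.0, seed=0):
    """Draw n waiting times (seconds) averaging mean_s (assumed exponential)."""
    rng = random.Random(seed)
    return [rng.expovariate(1.0 / mean_s) for _ in range(n)]


def reinforced_times(response_times_s, intervals_s):
    """Return the response times that would be reinforced under the schedule."""
    reinforced = []
    pending = iter(intervals_s)
    available_at = next(pending, None)  # when the first reinforcement is set up
    for t in sorted(response_times_s):
        if available_at is None:
            break  # no intervals left to schedule
        if t >= available_at:
            reinforced.append(t)
            following = next(pending, None)
            available_at = None if following is None else t + following
    return reinforced


# Deterministic illustration: one response every 10 s, three fixed intervals.
example = reinforced_times(range(10, 3601, 10), [30, 90, 60])
print(example)  # [30, 120, 180]
```

On such a schedule the number of reinforcements depends only weakly on response rate, which is why rate of response can vary so widely between Ss receiving similar amounts of reinforcement.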

The Lucero-Meyer Fergus Falls Behavior Sheet 
(LMBS) (7) was chosen to rate the ward behavior of 
the patients. This scale is brief and simple, dealing 
with observable hospital behavior. It has been used 
with a chronic population and seems sensitive to differences 
within such a group while being independent 
of psychiatric diagnosis. In addition, changes in a 
patient’s behavior can be picked up over a short length 
of time. The authors of the scale report reliability 
coefficients of .92 and .94. In the current study, no 
further check of reliability was made. 

A brief diagnostic test battery was also given. A 
short form of the Wechsler-Bellevue, Form I, consist- 
ing of Vocabulary, Comprehension, Block Design, and 
Picture Completion (C-BD-V-PC) was administered. 
The C-BD-V-PC was found by Patterson to correlate 
.96 with the full scale WB-I on a heterogeneous psy- 
chiatric population (8). If S was untestable by this 
form of the Wechsler-Bellevue, the Ammons Full 
Range Picture Vocabulary Test (APV), Form A, was 
attempted. In addition to the intellectual measure, the 
Rorschach was administered. The final clinical index 
obtained was the Tulane Psychological Test Behavior 
Rating Scale (TBS) developed by King (3). This scale 
represents an attempt to quantify the quality of behavior 
observed during psychological testing. The TBS 
was filled out by the clinical psychologist following 
each testing session. The operant behavior data were 
not known to the clinical psychologist until the pa- 
tients had all been tested. 


RESULTS 
Operant Data 


Three response measures were used in the 
analysis of the operant conditioning data: 
the rate of response (R/Hr.), the total number 
of inter-response times greater than 10 sec. 
(#IRT > 10″), and the sum of the inter-response 
times greater than 10 sec. (ΣIRT 
> 10″), i.e., the total amount of time during 
which S did not respond. These indices are 
described in detail by Lindsley (4). The latter 
FIG. 1. STANDARD APPARATUS USED FOR OPERANT 
CONDITIONING OF PSYCHOTIC PATIENTS; S PULLS 
THE KNOB ON THE LEFT OF THE PANEL AND IS 
REINFORCED BY THE INTERMITTENT PRESENTATION 
OF CANDY AND CIGARETTES IN THE 
MAGAZINE CHUTE ON THE RIGHT (DESCRIBED 
IN DETAIL IN [6] AND [4]) 


two measure intra-hour variability. The pres- 
ent paper will be concerned in the main with 
R/Hr. data. 
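
The three response measures can be computed from a record of response times as sketched below. This is an illustration only: it assumes that the gaps before the first response and after the last response count as inter-response times, which the text does not state, and the example record is hypothetical.

```python
def operant_measures(response_times_s, session_s=3600.0, pause_s=10.0):
    """Return (R/Hr., number of IRTs > pause_s, summed IRTs > pause_s in minutes)."""
    times = sorted(response_times_s)
    # Assumption: session start and end bound the first and last inter-response times.
    edges = [0.0] + times + [session_s]
    irts = [later - earlier for earlier, later in zip(edges, edges[1:])]
    long_irts = [gap for gap in irts if gap > pause_s]
    rate_per_hour = len(times) * 3600.0 / session_s
    return rate_per_hour, len(long_irts), sum(long_irts) / 60.0


# Hypothetical record: five responses in a one-hour session.
rate, n_pauses, pause_minutes = operant_measures([5, 8, 30, 31, 100])
print(rate, n_pauses, round(pause_minutes, 2))  # 5.0 3 59.85

# An S who never responds shows a single pause lasting the whole hour.
print(operant_measures([]))  # (0.0, 1, 60.0)
```

The second call shows why the number of pauses and their summed duration have to be read together: very few pauses may mean either steady responding or almost no responding at all.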

Table 1 summarizes the operant conditioning 
data for testable and untestable patients and 
the normal Ss. The data presented represent 
the median of the first ten hours in the study 
for each S in each group. It should be noted 
that the #IRT > 10 sec. and the ΣIRT > 
10 sec. must be compared in relation to each 
other. Thus, the untestable patients have few 
pauses but do not respond for almost the 
entire hour, while the testable patients take 
numerous short breaks. The normals only 
occasionally take a short break. It is clear that 
the two patient groups differ from each other 
as well as from the normals with respect to 
responsivity and intra-hour variability. 


Behavior Ratings 


The LMBS consists of 11 short scales, each 
pertaining to a particular kind of hospital 
behavior. Within each scale are five descriptions 
of behavior, ranging from most to least 
disturbed. The rater is instructed to place a 
check mark next to the description which best 
describes the patient. Each description receives 
a numerical score from one for the most 
disturbed type of behavior to five for the least 
TABLE 1 
OPERANT CONDITIONING RESPONSE MEASURES FOR 
TESTABLE, UNTESTABLE, AND NORMAL Ss 
(median values) 

Columns: R/Hr.; #IRT > 10″; ΣIRT > 10″ (in min.) 

Untestable 19 7) 
Testable 1,421 44 
Normals 9 566 

Note. Statistical evaluation of differences between the 
three groups was made by the Finney 2 × 2 contingency tables. 
All groups differed significantly (p < …) in median #IRT > 10″ 
and median ΣIRT > 10″. The difference between testable and 
untestable patients in median R/Hr. was significant (p < .04). 


TABLE 2 
TEST AND RATING DATA DESCRIBING UNTESTABLE, 
TESTABLE, AND NORMAL GROUPS 
(median values) 

Ratings: TBS; LMBS 

Untestable 
Testable S4 
Normal 5 107 

* IQ for testable patient Ss was on … or APV, Form A 


disturbed. The scores for all scales are averaged 
to obtain the total LMBS rating. There is thus 
a possible range of 1.0 to 5.0. The LMBS 
ratings of the patient group in this study 
ranged from 1.3 to 3.8 with a median of 2.3. 
These values are comparable to the mean 
score of 2.5 and a range of 1.3 to 4.7 reported 
by Lucero and Meyer in their account of 
construction of the scale (7). 
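
The total LMBS rating described above is a simple average, which can be sketched as follows. The validation rules follow the text (11 scales, each scored from one to five); the example ratings are hypothetical and are not data from the study.

```python
def lmbs_total(scale_scores):
    """Average the 11 LMBS scale scores (each 1 to 5) into the total rating."""
    if len(scale_scores) != 11:
        raise ValueError("the LMBS has 11 scales")
    if any(score not in (1, 2, 3, 4, 5) for score in scale_scores):
        raise ValueError("each scale is scored from 1 (most disturbed) to 5 (least)")
    return sum(scale_scores) / len(scale_scores)


# Hypothetical ratings for one patient.
example_rating = lmbs_total([2, 3, 2, 1, 3, 2, 2, 3, 2, 3, 2])
print(round(example_rating, 1))  # 2.3
```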

With respect to the relation of LMBS ratings 
to operant behavior, the median rate of re- 
sponse for the first ten hours was found to 
have a rank difference correlation of .23 (not 
significant) with the median LMBS ratings. 
The rank difference correlation between the 
LMBS rating and the median rate of response 
for the ten hours nearest to the date of the 
rating, however, was .81, significant at the 
.001 level. 
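
The stated significance levels can be checked with the usual t approximation for a rank correlation, t = ρ√((n − 2)/(1 − ρ²)). The sketch below assumes that all 22 patients entered both correlations, which the text does not state explicitly; the critical values are the standard two-tailed t values for 20 degrees of freedom.

```python
import math


def t_from_rho(rho, n):
    """t statistic for a correlation coefficient rho based on n pairs."""
    return rho * math.sqrt((n - 2) / (1.0 - rho * rho))


N_PATIENTS = 22      # assumption: every patient was rated and had operant data
T_CRIT_05 = 2.086    # two-tailed .05 critical value of t, df = 20
T_CRIT_001 = 3.850   # two-tailed .001 critical value of t, df = 20

t_first_ten = t_from_rho(0.23, N_PATIENTS)
t_nearest_ten = t_from_rho(0.81, N_PATIENTS)

print(t_first_ten < T_CRIT_05)     # True: not significant
print(t_nearest_ten > T_CRIT_001)  # True: beyond the .001 level
```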


Individual Testing 

Individual testing was attempted with all 
the patients in the group. Twelve of these 
patients were completely untestable. Six C- 
BD-V-PCs were administered and four APVs. 
All of the ten testable patients were able to 
perform at least the free association to the 
Rorschach. The test data, presented in Table 2, 
reflect the extreme chronicity of the population. 
The test data for the normal Ss are also 
presented. 

The Test Behavior Rating Scale (TBS) was 
scored by the tester. It consists of 12 subscales, 
each having a possible range of scores of 0 to 
40. A final score is obtained by summing all 
subscale scores so that the total score can 
range from 0 to 480. A score of 480 would 
imply a very good adjustment to the testing 
situation. The TBS ratings for the patients 
ranged from 0 to 336, with a median of 95. 
The original samples reported by King (3) in 
his development of the scale received mean 
scores from 299 to 399. It would seem, then, 
that the present population as a whole was 
more disturbed and less accessible than the 
Tulane patients.* 

The rank difference correlation of the TBS 
and R/Hr. of the first 10 hours was .13, while 
R/Hr. for the 10 hours nearest to the date of 
testing correlated .39 with TBS. Neither of 
these coefficients is significant, although it 
should be noted that the correlational magni- 
tude increased in the same general manner as 
in the case of the LMBS ratings. A significant 
direct relationship (p < .04) was found be- 
tween sheer testability and rate of response, 
using the Finney 2 X 2 contingency tables 
(1). Similarly, high and low scores on the TBS 
were significantly related to R/Hr. (Finney 
contingency tables, p < .019). This is essentially 
a repetition of the testability-untestability 
findings, since high TBS scorers were the 
testable patients and low scorers the untestable 
ones. A chi-square test was also performed on 
each subscale of the TBS. Two of the twelve 
subscales showed a significant relationship with 
the rate of response near testing date. These 
two were “ability to follow instructions” and 
“willingness to accept the task as an activity.” 
The former verifies the impression, stated in 
an earlier project report, that those patients 
who are able to understand simple directions 
are also high operant responders (10). The 
subscales of the LMBS were similarly ana- 
lyzed, but none of these proved significant. 


* A communication with King has indicated that his
ratings of detailed descriptions of our patients coin-
cide with those assigned by the present investigators.


DISCUSSION


This study strongly suggests that the operant
conditioning performance of chronic psychotic
patients and certain clinical variables
are related. The earlier impression of a positive 
relationship between rate of response and 
severity of illness was supported. It is meaning- 
ful to look at the results in terms of a notion 
of adaptability to the limited social and 
physical demands of the hospital environment. 
All of the patients in the present sample can 
be described as severely ill; the length of 
hospitalization bears this out. While they all 
require institutionalization, they nevertheless 
differ widely in their ability to adjust within 
such a setting. The LMBS may be considered 
an index of this ability, and its significant 
relationship to operant conditioning may be 
reflective of a general adjustive ability. Indi-
viduals (in a chronic group) who are high
operant responders may be those who remain 
sensitive to rewards in the social environment 
and can learn to manipulate this environment 
so as to obtain some of these rewards. The 
relationship of operant performance to clinical 
testability further supports this notion. Again, 
the high operant responders are those who can 
adjust to a type of environmental demand.

The fact that the correlations of clinical 
ratings and operant responsivity are raised by 
utilizing R/Hr. nearest to the date of ratings 
is of interest. This would seem to reflect 
sensitivity of operant performance to clinically 
observed changes in the patient. 


SUMMARY 


Twenty-two chronic psychotic patients, Ss 
in an operant conditioning study, were given 
psychological tests and rated as to their ward 
behavior. Ratings of ward behavior and rates 
of operant response were directly related. 
Those patients who were testable by at least 
one clinical test were those who were high 
operant responders. These findings were dis- 
cussed in terms of a notion of adaptability to 
the demands of the hospital environment. 

Clinical and operant data on six normal Ss 
were also presented. 
THE EFFECT OF PERSON-GROUP RELATIONSHIPS ON
CONFORMITY PROCESSES¹


JAY M. JACKSON AND HERBERT D. SALTZSTEIN
University of Michigan 


MANY previous studies of conformity behavior,
among them the well-known experiments of Sherif
and Asch (1, 2, 11), have employed
situations where the forces to conform were 
essentially cognitive in origin. A person in a 
situation of judgment or choice is confronted 
with contradictory information from two 
different sources. He has to weigh the evidence 
provided by his own perception of the stimulus 
against his knowledge of the actions or judg- 
ments of other persons. Festinger (5) has 
pointed out that a person will have a need for 
social reality, i.e., a need to depend upon in- 
formation provided directly or indirectly by 
others, to the degree that his information from 
so-called physical sources is inadequate. 

Forces to conform which are created by a 
person’s need for social reality have their 
source in his desire to make an appropriate 
rather than an inappropriate response, or to 
perceive the world accurately rather than 
inaccurately. This process of coming to cogni- 
tive terms with his environment can be distin- 
guished from other processes of conformity 
that are influenced by the person’s member- 
ship or nonmembership in a group, the strength 
of his attraction to membership, or the rele- 
vance of the situation to the goals of the 
group. 

In experiments by Festinger, Thibaut, 
Schachter, and others (5, 6, 10), forces to 
conform were created which derived from 
pressures towards uniformity in a problem- 
solving group. These forces, generated by a 
process called group locomotion, are induced 
upon all persons who belong to the group and, 
especially, upon any deviant who may be 
blocking the group’s progress toward its goal. 
The more attractive a group is for a member, 
the stronger are the forces from this source 
acting upon him to conform (3, 5). 

¹ This study was conducted under Contract Nonr-
1224(11) with the Office of Naval Research. A mimeo-
graphed report containing the detailed instruments and
procedures is available elsewhere (9). This article is
based in part on a report to the annual meetings of the
American Psychological Association, September 1956.
Forces to conform that arise out of the
necessity for group locomotion clearly depend 
upon the existence of a group of interdependent 
persons who reciprocally expect each member 
to help rather than hinder the group. These 
forces should exist only in areas that are 
relevant to the group’s goal-achievement. 
They will be induced upon all those accepted 
as members of the group, but not upon non-
members. They will be strongest for those
members who are most identified with the 
group and its objectives. 

This paper reports an experiment designed 
to test hypotheses derived from these theo- 
retical assumptions about social reality forces 
and group locomotion forces to conform.²


Normative and Modal Situations 

Implicit in the previous discussion is a 
distinction between two types of situation in 
which conformity behavior can occur. A norma-
tive situation is one where group members are
interdependently working towards a common 
goal, and consensus therefore takes on a pre- 
scriptive or normative value for members. A 
modal situation is one where either there is no 
common goal, or the task is not relevant to 
group achievement. Thus any agreement of 
opinion or judgment is not the result of recipro- 
cal patterns of influence, but a mere coinci- 
dence of events, a piling up of responses at 
the mode. 


Group Membership and Person-Group Relationships

Let us consider four types of relationship 
that occur between a person and a group.³

1. Psychological Membership (high accept- 
ance, high attraction): The person is highly 
accepted as a member of the group by other 
members. They expect him to adhere to a 
member’s role, conform to the norms of the 


² This distinction is similar to that made by Deutsch
and Gerard, who speak of informational and normative 
social influences upon an individual’s judgment (4). 

³ It is pointed out elsewhere (8) that these person-
group relationships are not discrete classes, but seg-
ments of a space. 
group, and contribute to its goals. The person
is motivated to take the role of member, thus 
highly attracted to membership in the group. 
He is both treated by others like a member of 
the group, and desirous of being a member. 

2. Psychological Nonmembership (low accep- 
tance, low attraction): The person has little 
or no acceptance as a member by other mem- 
bers, for whatever reason. He also has little 
or no attraction to membership in the group. 
Under these conditions he does not “belong” 
psychologically to the group. 

3. Preference Group Relationship (low accep- 
tance, high attraction): In this relationship, the 
person has little or no acceptance as a member 
by the other members. But his attraction to 
the group is highly positive, implying that he 
perceives the group as capable of fulfilling, 
directly or indirectly, some of his needs. He 
would prefer to belong to the group, but is not 
a member. 

4. Marginal Group Relationship (high accep- 
tance, low attraction): The other members of 
the group accept the person as a member. But 
he has little or no attraction to membership. 
He is considered to be a member by the others, 
but he is not motivated to take the role of 
member. 


Summary Statement of Theory 


The discussion has dealt with forces to con- 
form which have their source in the need for 
social reality and in the necessity for group 
locomotion in a goal-oriented group. It is 
possible to summarize the reasoning in the 
form of two assumptions that specify the 
relative magnitude of these forces to conform 
that will act upon a person in each of the four 
person-group relationships, in both normative 
and modal situations:

I. Forces to conform arising out of a need for 
social reality will have equal magnitude in both 
normative and modal situations, and will not 
vary with the person-group relationship. 

II. Forces to conform arising out of the
necessity for group locomotion will exist only for 
positively accepted persons in a normative 
situation, and will be positively related in magni- 
tude to such persons’ attraction to the group. 

In deriving hypotheses from these state- 
ments, we further assumed that forces to 
conform which act upon a given person in a 
particular situation are additive, and that his 


TABLE 1
SOURCES AND MAGNITUDES OF FORCES TO CONFORM
FOR FOUR PERSON-GROUP RELATIONSHIPS UNDER
TWO TASK CONDITIONS

Person-group        Experimental          Normative                 Modal
Relationship        Conditions            Situation                 Situation

Psychological       (high attraction,     social reality + high     social reality
Membership          high acceptance)      group locomotion (a)      (e)

Marginal Group      (low attraction,      social reality + low      social reality
Relationship        high acceptance)      group locomotion (b)      (f)

Preference Group    (high attraction,     social reality            social reality
Relationship        nonacceptance)        (c)                       (g)

Psychological       (low attraction,      social reality            social reality
Nonmembership       nonacceptance)        (d)                       (h)





conformity behavior is determined by the sum 
of all forces to conform acting upon him in that 
situation. 

Hypothesis 1. Conformity behavior of Mem-
bers⁴ will be greater in magnitude in a normative
than in a modal situation. The derivation of 
this prediction is extremely simple: social 
reality forces to conform acting upon Members 
have equal magnitude in both normative and 
modal situations (from I), but group locomo- 
tion forces act upon Members only in a norma- 
tive situation (from II). The total force to 
conform acting upon Members will be greater, 
therefore, in the normative than in the modal 
situation. 

Hypothesis 2. Conformity behavior of Margi- 
nals will be greater in magnitude in a normative 
than in a modal situation. The derivation of 
this and other hypotheses will be apparent in 
Table 1, where are summarized the relative 
magnitudes of forces to conform which act 
upon persons in the four person-group rela- 
tionships and two situations, as postulated in 
our two basic assumptions. 

Using the letters in this table for convenient 
designation, the first hypothesis can also be 
stated: conformity behavior in (a) > (e), or 


[1]  CB_a > CB_e


⁴ Persons in the four relationships will be referred
to as: Members, Marginals, Preferences, and Nonmem- 
bers. 







The second hypothesis, similarly, can be 
stated: 


[2]  CB_b > CB_f


By making comparisons between pairs of 
cells in Table 1, eleven other hypotheses, 
summarized below, were derived from our two 
basic assumptions and tested in this experi- 
ment. 


[3]  CB_a > CB_b
[4]  CB_a > CB_f
[5]  CB_b > CB_e
[6]-[13]  CB_a > CB_c, CB_d, CB_g, CB_h;  CB_b > CB_c, CB_d, CB_g, CB_h
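
The derivation of the hypotheses from Table 1 can be checked mechanically. The sketch below assumes an arbitrary numeric coding of the relative magnitudes (social reality = 1 in every cell, high group locomotion = 2, low group locomotion = 1, none = 0); these numbers are illustrative assumptions rather than values from the study, and only their ordering matters. Under the additivity assumption, it enumerates every pair of cells for which a strict inequality in conformity behavior (CB) is predicted.

```python
from itertools import permutations

# Illustrative coding of Table 1 (assumed values; only the ordering matters).
SOCIAL_REALITY = 1                    # present in all eight cells
GROUP_LOCOMOTION = {"a": 2, "b": 1}   # accepted persons, normative situation only

CELLS = "abcdefgh"


def total_force(cell):
    """Additive force to conform acting on a person in the given cell."""
    return SOCIAL_REALITY + GROUP_LOCOMOTION.get(cell, 0)


def predicted_inequalities():
    """All ordered pairs (x, y) for which the model predicts CB_x > CB_y."""
    return [
        (x, y)
        for x, y in permutations(CELLS, 2)
        if total_force(x) > total_force(y)
    ]


hypotheses = predicted_inequalities()
for x, y in hypotheses:
    print(f"CB_{x} > CB_{y}")
```

With this coding the enumeration yields thirteen strict inequalities, which agrees with the number of hypotheses the authors report testing: cell (a) exceeds each of the other seven cells, and cell (b) exceeds each of the six cells that carry social reality forces alone.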


METHOD 
Subjects 


The Ss were 100 male undergraduate volunteers, 
randomly assigned to 21 groups of four or five persons.
During or immediately after the experiment, 14 of the 
Ss indicated they had seen through some of the decep- 
tions. The analysis is based upon data from the remain- 
ing 86 Ss. 


Experimental Design 


In one half of the groups, Ss were in a high attraction
condition and in the other half a low attraction condi-
tion. In each group there was one S who was not ac-
cepted as a member by the others, each of whom was
highly accepted by all. Thus, high acceptance and non-
acceptance conditions were created. Each experimental
session consisted of two periods, one under normative 
conditions and one under modal, with the normative 
always coming first. It would have been preferable to 
control on this sequence, but it proved impracticable 
to begin with the modal condition: this involved hav- 
ing a person unacceptable as a member of a “group” 
in which everyone works on his own. In each period, 
an experimental task was performed by the Ss and
data obtained about their conformity behavior. Ques- 
tionnaires to validate the experimental conditions and 
to obtain additional data were administered immedi- 
ately before and after each experimental period. 





5 Since all thirteen hypotheses are derived from two 
theoretical assumptions, they are clearly not independ- 
ent. No attempt will be made, however, to aver the 
truthfulness of our theory by assessing the probability 
of achieving a given number of significant confirma- 
tions. 


FIG. 1. A “SOCIAL VISION” PROBLEM; SUBJECT MUST
JUDGE WHICH IS THE SHORTEST PATH
FROM “START” TO THE “GOAL”


Experimental Task


The task consisted of a set of 10 “social vision”
problems, each a variation of the problem seen in Fig. 
1. This maze-like figure is a somewhat more compli- 
cated version of Asch’s line problem (1, 2). Ss were 
instructed to find the shortest path from the starting 
point to the black arrow. The experimenter pointed 
out that in each problem it became a choice between 
two of the four possible paths. By varying the posi- 
tion of the arrow around the circumference of the circle, 
26 problems were constructed which varied in their 
degree of difficulty or ambiguity. Each set of 10 prob- 
lems included a range from one extreme to the other, 
and the sets were matched for difficulty. 

Each problem was projected onto a screen about 20 
feet distant from the Ss, who were seated in a semi- 
circle. After a problem had been exposed for three 
seconds, Ss were asked to write unsigned notes to 
other Ss. Each note had to include the sender’s opinion 
as to the correct answer and how confident he was of 
his response. He could also send any remarks he felt 
like writing. The notes were collected for “sorting” and 
after a suitable time interval, substitute notes were 
delivered, with each S receiving identical notes. On 
eight of the ten trials, the notes reported back the 
incorrect answer, held with a high degree of confidence 
by each other member of the group. After Ss had read 
their notes, the same problem was then exposed for a 
second time. This time Ss were asked to write their 
answers upon a special form provided for the purpose. 


Inducing the Experimental Conditions 

In the high attraction condition, Ss were told that
they had been matched for congeniality and ability to 
work well together, using information from their volun- 
teer forms. The importance of the study was stressed 
to them. They were told about cash prizes that could 
be won. They were also told that each would have an 
opportunity later to learn more about the study and 
its results. In the low attraction condition, Ss were told 
that we had been unable to match them, that the
study was relatively unimportant, that they would not 
have an opportunity to learn about the research, and 
that prizes were no longer being awarded. 

The acceptance conditions were induced by the 
following procedure. The experimenter announced that 
he had one more person than was needed, since all 
groups had to be a standard size. This had happened 
several times previously, so a procedure had been 
developed to eliminate one person “in a way that will 
be fair to all.” Ss were given a preliminary set of 
“social vision” problems, and spurious scores were 
announced, with all except a randomly selected person 
obtaining uniformly high scores, and the latter receiv- 
ing a much lower score. Ss were then asked to indicate 
on a ballot which person they wished to exclude. The 
experimenter, after scanning the ballots, announced 
that the randomly selected person was the person to 
be excluded. The nonaccepted person, who usually rose 
to leave at this point, was then asked by the experi- 
menter to remain and participate in the task, because 
“later on everyone will be working on individual tasks, 
and we want you to have the same experience and 
practice as the others.” The effect of this deception 
was to have a number of interdependent Ss, who 
thought of themselves as a group about to work for a 
common goal (in the normative condition), and one 
other S, physically located in the group, engaging in 
the same activities but psychologically not accepted 
by the others as a member. Thus, in each group there 
were a number of Ss in the high acceptance condition, 
and one S in the nonacceptance condition. 

The normative condition was created by telling the 
“group members,” i.e., the accepted persons only, that 
they would be given a group score on the task to fol- 
low, and would be compared as a group with other 
groups. The modal condition was created by telling all 
Ss that each would be given an individual score on the 
task to follow and would be compared with all the 
other persons in his “position.” (Each S sat before a 
small table. These “positions” were lettered consecu- 
tively, A, B, C, etc. Subject A was told he was com- 
peting against all the other As in the experiment, sub- 
ject B against all the other Bs, and so on.) Each person 
was on his own, trying to maximize his own score. It 
was emphasized that “there is now no group.” 

Following the modal condition, Ss were told there
would be a brief reversion to the normative situation 
for a third set of problems, immediately after a third 
questionnaire had been filled out. This procedure was 
intended to permit us to reconstitute the group, psy- 
chologically, as a frame of reference for asking certain 
questions. There was no third period, however. Instead, 
it was announced that the experimental session was 
over. All deceptions were then explained fully, and 
care taken that Ss had time to express their feelings 
and to receive adequate reassurance and explanation. 
Before leaving, each S made a commitment not to 
divulge the procedures to others. 


RESULTS AND DISCUSSION 
Validation of Experimental Conditions 
Data were available from the questionnaires 


by which we could determine how well we had 
been able to induce the desired conditions. 


TABLE 2
FREQUENCY WITH WHICH OTHERS IN A GROUP WERE
INCLUDED AS MEMBERS BY THOSE IN THE “HIGH
ACCEPTANCE” CONDITION PRIOR TO THE
NORMATIVE TASK

Experimental Condition     Responses on Questionnaire I
of Recipient               Not Included     Included

High acceptance            10               181
Nonacceptance              43               25

Note. χ² = 10.36; p < .01.


TABLE 3
MEAN ATTRACTION TO THE GROUP

Person-group         Experimental              N     Initial    (header
Relationship         Conditions                                 illegible)

Psychological        (high att., high acc.)    33    7.4        8.8
Membership
Marginal Group       (low att., high acc.)     35    7.7        7.8
Preference Group     (high att., nonacc.)      10    7.8        8.7
Psychological        (low att., nonacc.)        8    5.6        6.9
Nonmembership





1. Acceptance as a member. Our conceptual
definition of acceptance involved locating the 
accepted person within the power field of the 
group; or more simply, recognizing him to be a 
member of the group. The following question 
was asked immediately preceding the norma- 
tive condition: 


Please indicate what contribution you think that 
each member of the group will make to the total group 
score on the Group Prize task you are about to do. 


Respondents were asked to rank the members 
of the group. An appropriate validation of 
acceptance or nonacceptance is whether or not 
an S was included as a member of the group, 
i.e., was assigned any rank whatsoever on this 
item. In Table 2, the hypothesis of independ- 
ence can be rejected with a high degree of 
confidence. Nonaccepted Ss were included 
as a “member of the group” significantly less
often than accepted Ss. 

Almost the identical question was asked 
after completion of the modal task condition. 
The results from this item were similar to 
those in Table 2, χ² being 12.54, significant at
the <.001 level. It appears that the desired
difference between high acceptance and non- 
acceptance conditions was created and per- 
sisted throughout the experiment. 
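
The test of independence for Table 2 can be reproduced from the four printed counts. The following sketch uses the standard shortcut formula for a Pearson chi-square on a 2 x 2 table without continuity correction; whether the authors applied a correction is not stated, so that choice is an assumption, and the function name is illustrative.

```python
def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    numerator = n * (a * d - b * c) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return numerator / denominator


# Counts as printed in Table 2: rows are the recipient's condition
# (high acceptance, nonacceptance); columns are (not included, included).
chi2 = chi_square_2x2(10, 181, 43, 25)
print(round(chi2, 2))
```

With the counts as printed, the statistic comes out near 103.6 rather than the printed 10.36, so one of the two may have been altered in reproduction. Either value exceeds 6.63, the critical value of chi-square at the .01 level with 1 df, so the conclusion drawn in the text is unaffected.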





2. Attraction to the group. The conceptual
definition of attraction was “the resultant force
acting on an individual to remain in or loco- 
mote into a group.” The following item was 
included in all three questionnaires: 


If you were given the opportunity at this point to 
change to a different group of fellows meeting next 
door, what would you do? 


The possible alternative responses, on a 10- 
point scale, varied from “I’d very much want 
to remain a member of this group,” to “I’d 
want very much to change to another group.” 
The mean scores from this item, presented in 
Table 3, show Ss in the high attraction condi- 
tion having greater preference for their own 
group than those in the low attraction condi- 
tion. When we combined acceptance conditions, 
however, only on the second questionnaire 
was the difference between high attraction 
and low attraction means significant at the 
<.01 level, using the t test. One possible
explanation for our failure to create larger 
differences in attraction between the two 
high acceptance conditions is that being ac- 
cepted in a group may increase a person’s 
attraction to it. Thus, in spite of our attempt 
to create low attraction for those in the mar- 
ginal group relationship, they do not differ 
much from those in the psychological member-
ship condition. It appears, therefore, that we
were successful in creating differences between 
the two attraction conditions, but not very 
large differences, measured by this criterion. 

3. Task conditions: normative and modal. 
After the completion of both experimental 
periods, we asked Ss: “How much do you 
think that the members of the group wanted 
to influence your answers?” on each set of 
problems. In both high and low attraction 
conditions, Ss attributed to others more 
desire to influence them in the normative 
period than in the modal. The differences were 
both significant at the <.001 level, using two-
tailed t tests. This constitutes indirect evidence
of the validity of the two task conditions. 


Results for Conformity 

The measure of conformity used was the 
proportion of person-trials on which change
to the judgment of the divergent majority 
occurred. For each S on each trial we first 
considered whether or not his initial judgment 
(expressed by the notes he had sent to others)
was the same as or different from that of the 


TABLE 4

PROPORTION OF TRIALS ON WHICH CHANGE OCCURRED
TO THE UNANIMOUS DIVERGENT JUDGMENT OF
THE MAJORITY

Person-group     Experimental             Nᵃ    Normative Task    Modal Task
Relationship     Conditions                     Condition         Condition

Members          high att., high acc.     33    .622 (a)          .376 (e)
Marginals        low att., high acc.      35    .492 (b)          .331 (f)
Preferences      high att., nonacc.       10    .564 (c)          .560 (g)
Nonmembers       low att., nonacc.         8    .523 (d)          .510 (h)

* p < .01.
** p < .001.
*** p not significant by inspection.
ᵃ In the Modal condition, both the Marginals and Nonmem-
bers conditions had one less subject than indicated above.


spurious unanimous majority judgment, ig-
noring the degree of confidence expressed.⁶
Only those cases where these judgments dif- 
fered were included in the analysis. On each 
trial, an S’s final judgment was then categor- 
ized as changed or did not change to the judg- 
ment of the majority. The results of this 
analysis are presented in Table 4. 
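
The conformity measure just described can be sketched as follows. The representation of a trial as an (initial, majority, final) tuple, the function name, and the sample data are all hypothetical; only the rule itself (exclude trials on which the initial judgment already matched the majority, then count changes to the majority judgment) comes from the text.

```python
def conformity_proportion(trials):
    """Proportion of eligible person-trials on which S changed to the majority.

    Each trial is a tuple (initial, majority, final). A trial is eligible only
    when the initial judgment differed from the majority judgment.
    """
    eligible = [(i, m, f) for i, m, f in trials if i != m]
    if not eligible:
        return None  # no divergent trials, so the measure is undefined
    changed = sum(1 for _, m, f in eligible if f == m)
    return changed / len(eligible)


# Hypothetical person-trials (path labels are illustrative, not study data).
sample_trials = [
    ("A", "B", "B"),  # disagreed at first, then changed to the majority
    ("A", "B", "A"),  # disagreed and held to own judgment
    ("B", "B", "B"),  # already agreed: excluded from the analysis
    ("C", "D", "D"),  # disagreed at first, then changed
]
sample_proportion = conformity_proportion(sample_trials)
```

In this made-up sample, three trials are eligible and two of them show a change, giving a proportion of two thirds. Returning `None` for a subject with no divergent trials is a design choice of the sketch, made so that such cases are not silently scored as zero conformity.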

If we compare these results with the predic- 
tive model in Table 1, we find that the data 
correspond to the predictions made for Ss in 
the high acceptance condition but not for those 
in the nonacceptance conditions. The first five 
hypotheses, all concerning the conformity 
behavior of accepted persons, are supported by 
the results in Table 4. Hypotheses 6 to 13, 
however, which make predictions involving 
the conformity behavior of nonaccepted per- 
sons, are not confirmed by our results. First,
the implications of the findings regarding 
accepted persons will be discussed and some 
additional results presented that are relevant 
to this discussion. Then we shall undertake 
to explain our failure to predict the conformity 
behavior of nonaccepted persons by a re- 
examination of the assumptions underlying 
the theory and operational definitions. 

Conformity behavior is greater in the norma- 
tive than in the modal condition for both 
Members and Marginals, as predicted by the 
first two hypotheses. Members also conform 

⁶ Our initial analysis was performed using a scale
measure of conformity which included the degree of 
confidence. This yielded results which, although not 
substantially different from those reported here, were 
less clear. We are indebted to Leon Festinger for sug- 
gesting the later analysis. 
TABLE 5

MEAN NUMBER OF NOTES SENT PER POSSIBLE
RECIPIENT

                          Normative Task Condition      Modal Task Condition
Person-group Relation-    Accepted      Nonaccepted     Accepted      Nonaccepted
ship of Sender            Recipients    Recipients      Recipients    Recipients




8.4 4.5 . 7.9 


4.0 6.3 


Members 


Marginals 7.9 
5.9 
6.0 


Preferences 
Nonmembers 





more than Marginals in the normative condi- 
tion, providing support for the third hypothesis 
(t is 1.88 for 66 df, significant at about the .06
level).⁷ It follows that Hypothesis 4 is con-
firmed: Members in the normative condi-
tion conform more than Marginals in the
modal (t is 3.63 for 66 df, significant at the
<.001 level of confidence). The fifth hypoth- 
esis, which predicted that Marginals in the 
normative condition would conform more than 
Members in the modal, is also supported, 
although with somewhat less confidence, by 
the results in Table 4 (t is 1.63 for 66 df, with
p = .10). Since these five hypotheses were 
all derived from the two assumptions about 
social reality and group locomotion forces to 
conform, they appear to provide support for 
these theoretical statements. It seems espe- 
cially likely that the latter assumption is 
trustworthy, since it has generated a number 
of successful predictions and can scarcely be 
saddled with the responsibility for failing to 
predict the conformity behavior of nonac- 
cepted persons. 

Some additional findings lend support to 
the group locomotion postulate. If a group of 
interdependent persons induce forces to con- 
form upon all those accepted as members, but 
not upon those unaccepted, in areas relevant 
to group locomotion, this pattern of influence 
attempts should be reflected in the volume 
and direction of communications that were 
sent during the experiment. In Table 5 is 
presented the mean number of notes, adjusted 
for number of possible recipients, sent in each 
experimental condition. In the normative 
condition, both Members and Marginals sent 
far more notes to accepted than to nonac- 
cepted recipients (t is 7.09 for Members and


⁷ The two-tailed t test was used throughout the
study, regardless of whether or not the finding was 
predicted in advance. 


10.83 for Marginals, with p < .001 in each 
case). Thus when a group of persons are inter- 
dependent with respect to a common goal, 
they apparently consider others whom they 
accept as members to be more appropriate 
targets of communication than those whom 
they do not accept. 

When forces to conform to others’ inductions 
are acting upon members of a group, it is likely 
that these persons will experience some desire 
to behave in the induced direction. On the 
final questionnaire, Ss were asked the following 
question concerning both the normative and 
modal task situations: 

During the Group-Prize (Position-Prize) set of 
problems, how much tendency did you feel to give the 
same answers as the other fellows? 


The mean ratings (on a 10-point scale) for 
each experimental condition are presented in 
Table 6. Both Members and Marginals felt a 
greater tendency in the normative condition 
than in the modal to give the same answers 
as the other persons in their group. Such was 
clearly not the case for persons in the non- 
accepted conditions. Thus the accepted per- 
sons not only conformed more in the norma- 
tive condition than in the modal, but they were 
also aware of a greater desire to do so, presum- 
ably because, in the normative condition but 
not in the modal, they felt the pressure of 
other members’ expectations. 

Let us turn now to consider the unconfirmed 
hypotheses about the conformity behavior of 
nonaccepted persons. The model in Table 1 
calls for conformity behavior of Preferences 
and Nonmembers to be the same in both 
normative and modal conditions, and not 
different from the conformity behavior of 
Members and Marginals in the modal condi- 
tion. These predictions derive from the as- 
sumption that social reality forces to conform, 
of equal magnitude, are the only such forces 


TABLE 6

MEAN RATING OF OWN DESIRE TO GIVE SAME
ANSWERS AS OTHERS

Person-group Relationship     Normative Task     Modal Task





Members wa 
Marginals , 4 
Preferences ; .6 
Nonmembers 6 





* p < .05.
** p < .01.
TABLE 7
MEAN ESTIMATE OF OWN RELATIVE ABILITY








Person-group Relationship Mean Rating 





Members 
Marginals


Preferences 
Nonmembers 





acting upon persons in these six cells. An 
examination of Table 4 suggests, however, 
that some unpredicted additional force to 
conform was acting upon Preferences and 
Nonmembers in both normative and modal 
conditions. In all four cells their conformity 
behavior is uniformly higher than was 
anticipated. 

A possible explanation lies in our experi-
mental manipulation to create the nonaccept-
ance condition. By giving the nonaccepted S
information that his ability on the preliminary 
set of problems was poor, it is likely that his 
confidence in his ability to perform the subse- 
quent tasks was lowered. Hochbaum (7) found 
that persons lacking in self-confidence were 
less able to withstand group pressures to 
uniformity. Where a person has little confi- 
dence in the information provided by his 
own perception of a stimulus, he will have an 
enhanced need for social reality. 

On the final questionnaire Ss were asked 
the following: 


Please compare your own ability on the problems to 
the average ability of the other persons present (ex- 
cluding the two researchers). 


The results of these ratings are presented in 
Table 7. Persons in the nonacceptance condi-
tions clearly estimate their ability to be less
than do those in the high acceptance conditions 
(the difference between Members and Prefer- 
ences is significant at the .07 level, and that 
between Marginals and Nonmembers at <.001 
level of confidence). It is probable, therefore, 
that the unanticipated additional force to 
conform acting upon nonaccepted Ss derived 
from their greater need for social reality 
created by the experimental induction of 
nonacceptance. Whether a person who is not 
accepted as a member of a group for reasons 
unrelated to his ability to perform a particular 
task will also have similar forces acting upon 
him to conform is a question that needs to be 
answered. 

An alternative explanation to the one offered
above is suggested by our experiences in
running this experiment, and certain tendencies
in the data too weak to be discussed
here. Our observations led us to believe that 
nonaccepted persons were experiencing con- 
siderable anxiety. They were greatly relieved, 
almost without exception, when informed in 
the postexperimental interview about the 
deception that led to their exclusion from the 
group. A closer examination of the psycho- 
logical situation of the person who is not 
accepted as a member of a desirable group 
indicates that the experience may be quite 
disturbing, since the person’s ego-processes 
are under threat. Under such circumstances, 
the person may have what might be called a 
need for social reassurance, created by his 
exclusion⁸ from the group on invidious grounds. 
Such a need might lead the person to defend 
against rejection by identifying with the mem- 
bers of the group and conforming closely to 
their opinions and judgments. 
SUMMARY AND CONCLUSIONS 

An experiment was designed to test hy- 
potheses derived from assuming distinctive 
processes called social reality and group locomo- 
tion, each of which generates forces to conform 
under specified conditions. Four types of
person-group relations were created by experimentally
varying subjects’ attraction to the
group and acceptance as a member. A modifi- 
cation of Asch’s line problem and experimental 
situation was used to test the conformity 
behavior of subjects. 

The results are in accord with hypotheses 
advanced about the conformity behavior of 
highly accepted persons, thus supplying evi- 
dence in favor of the assumptions about social 
reality and group locomotion processes. An 
analysis of communication patterns within 
the group, and of subjects’ feeling about 
wanting to answer the same as other persons 
in the group, also supports the conclusion that 
in a goal-oriented group of interdependent 
persons working upon a relevant task, forces 
to conform are induced and perceived by all 
persons accepted as members. 

⁸ A re-examination of our operations in this experiment
leads to the conclusion that we created negative
acceptance rather than nonacceptance, as intended, since
Ss were not just located outside the boundary of the
group but were barred from membership. This distinction
is discussed more fully elsewhere (8).

A number of hypotheses, derived from the
same theoretical assumptions, were also advanced
about the conformity behavior of nonaccepted
persons. These hypotheses failed to
predict correctly, since the conformity be- 
havior of these subjects was uniformly higher 
than anticipated. It is likely that the experi- 
mental manipulation of nonacceptance created 
an enhanced need for social reality in excluded 
subjects: they perceived their own ability on 
the experimental task to be relatively low. 
These additional social reality forces to con- 
form may have been responsible for the un- 
predicted high level of conformity of non- 
accepted subjects. 

An alternative explanation is also proposed 
for the high level of conformity behavior of 
nonaccepted persons: that they were really 
rejected rather than not accepted, and that this 
created in them a need for social reassurance. 
It is suggested that in response to this need, 
they had forces acting upon them to regain 
membership in the group by identifying with 
its members and conforming to their judgments.

VARIABILITY OF LIGHT PERCEPTION THRESHOLDS IN
BRAIN-INJURED CHILDREN¹
HENRY J. MARK
Johns Hopkins University, Department of Otolaryngology
AND BENJAMIN PASAMANICK
Ohio State University, Department of Psychiatry 


THIS paper reports some differences in 
light perception threshold measurements 
found between brain-injured (pyram- 
idal tract damage) and control children. The 
results were obtained in the course of a larger 
study aimed at discovering neuropsycho- 
logical measurements which might be used to 
detect otherwise unnoticed alterations in cere- 
bral functioning. Attention was directed at 
intra-individual threshold variability, rather 
than the average threshold level alone, and the 
brain-injured children were expected to have 
more “diffuse” or variable thresholds than the 
normal controls. Increased intra-individual 
threshold variability was hypothesized as a 
“fundamental disturbance” in Goldstein’s 
sense of that term (2), i.e., it was considered 
that the effect of a localized lesion may be 
detectable in many cortical areas, thus directly 
or indirectly involving not only the primary 
sensory and motor projection areas but also 
those areas assumed to mediate more complex 
cortical functions. 

Intra-individual response variability has 
been of general interest in psychology in such 
different areas as test construction and psycho- 
physical threshold measurements. In clinical 
psychology, intra-individual variability among 
the subtests of intelligence scales is of diag- 
nostic interest and is often referred to as 
“scatter of function,” representing an S’s 
ability level in different intellectual areas of 
functioning. In clinical audiology, interest now 
is focused on repeated threshold determina- 
tions, increased threshold variability within 
the auditory sense modality being considered 


¹ This research was supported by an Alfred P. Sloan 
Foundation, Inc., grant to the Division of Laryngology 
and Otology, Department of Surgery, Johns Hopkins 
University, as well as by a grant from the Foundation 
for Mentally Retarded and Handicapped Children of 
Baltimore. The authors wish to thank Paul Meier for 
invaluable assistance and advice, and Victor Laties and 
Bernard Weiss for their critical reading of the manu- 
script. 


an important factor in distinguishing central 
hearing or language disorders from end-organ 
impairment.² This report restricts itself to 
the results obtained in light perception thresh- 
olds. Studies with repeated critical flicker 
fusion threshold measurements (3) and other 
visual phenomena (4), as well as with thresh- 
old measurements in other sense modalities 
(5) in the same Ss as those used here, report 
similar findings of increased intra-individual 
variability in the brain-injured. One may 
therefore hypothesize that organic dysfunction 
readily manifests itself in lack of consistency 
of one sort or another; and, indeed, it is easy 
to conceptualize how increased variability in 
so-called simpler primary functions may ac- 
tually give rise to dysfunction in processes 
such as conditioning and memory as well as 
more complex behavior patterns. 

In addition to variability of threshold, com- 
parison between the brain-injured Ss and 
controls in the magnitude of the thresholds (i.e., 
exposure time required) is of interest in view 
of the findings in the CFF experiments with 
the same groups that only under a few condi- 
tions were there lowered CFF thresholds in 
the brain-injured. It was hoped that the effects 
of a presumably distant cortical lesion on 
visual efficiency would be more readily detect- 
able in the light perception thresholds than in 
the CFF thresholds of the brain-injured 
children tested. 

The threshold point is conventionally taken 
as that stimulus magnitude for which the 
probability of a response (“light seen”) is 50 
per cent. In these experiments, the estimated 
threshold is always the mean of ten threshold 
measurements. By hypothesis, higher thresh- 
olds (longer exposure time), corresponding to 
loss of visual efficiency, should be associated 
with the brain-injured. The estimated spread 
or variability is reported as the V score, and
is always taken to be the logarithm³ of the
variance of the distribution of measurements
around the mean, i.e., around the thresholds.
Accordingly, brain-injured Ss should show
larger V scores (greater intra-individual variability).

² Hardy, W. G. Personal communication, May 1957.

Measurement of spread is complicated by 
the fact that the observed S-shaped psycho- 
metric curve consists of what is traditionally 
called a sensory or “true” component plus a 
measurement error. If this measurement error 
is large relative to the difference in spreads 
between two Ss, then the resultant large 
spread due to measurement error will obscure 
any difference in spread between two Ss 
which may in fact exist. An attempt was 
therefore made to refine apparatus and tech- 
nique in order to reduce inherent measurement 
error to a magnitude where it would not 
interfere with distinguishing abnormal spreads. 


METHOD 


Subjects 


The brain-injured group was composed of ten chil- 
dren (six males, four females) with diagnosed pyramidal 
tract involvement and with no primary diagnosed 
ophthalmologic pathology. The control group was com- 
posed of ten children (five males, five females) with 
diagnosed conditions which did not justify an assumption
of brain damage or injury, such as Erb’s palsy,
osteochondritis, or tuberculosis of the spine. All Ss 
attended a Baltimore school for handicapped children, 





³ The V scores are given as $4 + \log_{10} s^2$ (rather than
$s^2$ or $s$) for the following reason. If we suppose that the
psychometric function can be described as a cumulated
normal distribution, we know that the distribution of
$s^2$ is that of a constant times a mean square (chi square/
degrees of freedom) distribution, i.e., $s^2$ is distributed as
$\sigma^2(\chi^2/df)$, where $\sigma^2$ is the variance of the psychometric
curve. Consequently, the distribution of $V = \log s^2$ is
that of a constant plus the logarithm of a mean square,
i.e., $V = \log s^2$ is distributed as $\log \sigma^2 + \log(\chi^2/df)$.
Since $\sigma^2$ is a parameter of the curve, the precision of V
depends only on the (known) distribution of the second
term. Thus, since 10 measurements were made for each
threshold, the V scores all have equal precision, and the
variance of a single V score is 0.0469 (1). An approximate
test for equality of a set of k variances is given by
calculating the ratio of $\sum (V_i - \bar{V})^2/(k - 1)$ to 0.0469
and comparing the result with the appropriate percentage
point of the $F_{k-1,\infty}$ distribution.

It must be emphasized that it is the heterogeneity
rather than the absolute magnitude of the individual
$\sigma^2$ values which affects the value of $\sum (V_i - \bar{V})^2/(k - 1)$.
If individual differences in spread are large compared
with measurement error, the mean square will be large.
However, if measurement error is increased so that the
differences in spread are obscured, the mean square
will tend to be reduced and the differences in spread
may not be detected.
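
The computation described in Footnote 3 can be sketched in Python as follows. This is only an illustration under stated assumptions, not the authors' own procedure: the function names are hypothetical, s² is assumed to be the sample variance with an n − 1 divisor, and the trigamma function is approximated numerically in order to recover the 0.0469 figure quoted in the footnote for 9 degrees of freedom.

```python
import math
import statistics


def v_score(measurements):
    """V = 4 + log10(s^2), with s^2 the sample variance (n - 1 divisor)."""
    return 4 + math.log10(statistics.variance(measurements))


def trigamma(x):
    """Approximate trigamma(x) by upward recurrence plus an asymptotic series."""
    total = 0.0
    while x < 6.0:
        total += 1.0 / (x * x)
        x += 1.0
    total += (1 / x + 1 / (2 * x**2) + 1 / (6 * x**3)
              - 1 / (30 * x**5) + 1 / (42 * x**7))
    return total


def var_log10_mean_square(df):
    """Variance of log10(chi-square/df), i.e. trigamma(df/2) * (log10 e)^2."""
    return trigamma(df / 2) * math.log10(math.e) ** 2


def heterogeneity_ratio(v_scores, df=9):
    """Observed variance of a group's V scores divided by the variance
    expected if every S had the same sigma^2 (0.0469 when df = 9)."""
    return statistics.variance(v_scores) / var_log10_mean_square(df)


if __name__ == "__main__":
    # Hypothetical exposure-time readings (ms) for one S; not data from the study.
    readings = [2.1, 2.6, 1.9, 3.0, 2.4, 2.2, 2.8, 2.5, 2.0, 2.7]
    print("V score:", round(v_score(readings), 3))
    print("Variance of one V score, 9 df:", round(var_log10_mean_square(9), 4))
```

With ten Ss in a group, the value returned by `heterogeneity_ratio` would be compared with the F distribution on 9 and infinite degrees of freedom, as the footnote describes.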


The groups were roughly comparable in terms of age, 
IQ, and sex, and the effects of these covariates on all 
significant discriminators were examined by means of 
regression diagrams. None was significantly related to 
performance. For example, there was no apparent re- 
gression of thresholds or V scores on IQ (which ranged 
between 64 and 121), and it was therefore assumed that 
IQ did not enter as an important experimental factor in 
the difference between brain-injured and control Ss. 
The mean IQ of the brain-injured group was 89 with an 
SD of 16; the mean for the control group was 96 with 
an SD of 16. The age range was between 10 and 15 
years with a mean of 12 for both groups. 


Apparatus 

The apparatus consisted of two Sylvania R 1131 C 
glow-modulator tubes mounted centrally on a perimeter 
so that two focused lines of light, each 34 mm wide and 
6 mm long and separated by a distance of 4 mm could 
be projected by means of a simple achromatic lens onto 
a piece of white plastic mounted on black velvet cloth. 
The glow modulator was activated by an electronic 
pulse-type stimulator and timer (6) which permitted 
continuous and independent variation of exposure 
times, t_1 and t_2, of the two light stimuli, as well as
continuous and independent variation of the pause (p)
between the onset of the two light stimuli. The two
rectangular focused lines of light reflected by the white
plastic were viewed by the Ss from a distance of 66
centimeters. In both parts of this experiment, the
exposure time of the left light (t_1) was the only experimental
variable and is reported in milliseconds (ms).
The exposure time of the right light (t_2) was held
constant at 135 ms throughout. In Part I (absolute light
thresholds), the pause between the onset of the two 
lights was zero (i.e., they were presented simultane- 
ously); in Part II (apparent movement thresholds), the 
pause between onset of the two lights was held constant 
at 150 ms, which, under these conditions, invariably 
gave rise to an apparent movement phenomenon. 
Perception of such movement and reported cessation of 
movement was thus considered evidence of perception 
of the left stimulus light. The frequency of presentation 
of the two lights was electronically timed at one per 
second throughout both parts. The experiments were 
carried on in total darkness. With constant exposure, 
the reflected lights had a brightness of the order of one- 
tenth millilamberts (0.1 ml) as measured with a Mac- 
Beth Illuminometer. (In order to have an adequate 
range for threshold measurements of t_1, the brightness 
of the left light was actually somewhat less than that.) 


Procedure 


For Part I (absolute light thresholds), two stimulus 
lights were presented simultaneously once per second. 
The exposure time (t_1) of the dimmer (slightly reddish)
light was varied continuously from .2 ms to 9.3 ms. The
first measurement was the descending threshold (L_Desc)
where t_1 was continuously decreased. S reported when
he was no longer able to see the left light (but of course
continued to see the right light). Then an ascending
measurement (L_Asc) was obtained where t_1 was
continuously increased. S reported when he again saw the
left light next to the right light. This procedure was
repeated ten times. In this way two thresholds and two
V scores (L_Desc and L_Asc) were obtained for every S,
with each threshold based on 10 measurements and each
V score therefore associated with 9 degrees of freedom. 
Before these measurements were made, the Ss were 
given practice in making these reports. 

The same procedure was followed for Part II (apparent
movement thresholds), but the lights were presented
sequentially with a pause of 150 ms between the
onset times of the two lights. (A pilot investigation with
children outside the study indicated that such a pause
always produced a movement effect under these physical
conditions.) Continuous decrease of t_1 had no other
obvious effect than to stop the “movement,” leaving
the line blinking on and off in the same place once per
second. An increase of t_1 again produced the movement
effect. In this part of the experiment it was more
convenient to have an ascending measurement (M_Asc)
always precede a descending measurement (M_Desc),
signifying the beginning and cessation of the movement 
effect respectively. The two thresholds obtained were 
again based on 10 measurements each, and the cor- 
responding V scores are therefore again associated with 
9 degrees of freedom. 

The instructions, aimed at minimizing differences between
ascending and descending thresholds, follow:

For L_Desc: You now see two lines,⁴ one brighter
than the other. You will notice that the bright light 
will always be there, and will not change. The dim 
light is a little red and is the one to watch carefully. 
The dim light will get dimmer and dimmer so that 
you can hardly see it any more until it is all gone. 
Let me know as soon as you notice the dim light is 
not there any more. Let me know as soon as you 
notice only the bright line without the dim light next 
to it. 

For L_Asc: Now you see only the bright line. Let 
me know as soon as you notice that there is a dim 
light next to it again. As soon as you notice that the 
bright line is not just blinking on and off alone, but 
that there’s a dim light next to it again. 

For M_Asc: Now you see one line that is standing
still and just going on and off. Let me know as soon
as you notice that the line is moving across a little
bit—as soon as you notice it is not standing still but
moving across from left to right even a little bit. 

For M_Desc: Now you see one line that is moving
across from left to right. Let me know as soon as
you see that it’s not moving across any more, that it
is standing still; that it is just blinking on and off, 
but not moving across any more. 

Nonparametric rank methods (7) were indicated in 
view of the sometimes erratic estimates of spread (V 
scores) of the psychometric functions. The .05 level of 
significance was adopted as the criterion for rejecting 
the two-tailed null hypothesis of no difference between
groups.
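
As an illustration of the kind of rank comparison described above, the following Python sketch computes a Mann-Whitney U statistic for two independent groups. The source does not say which rank method from reference (7) was applied, so the choice of test here is an assumption, as are the function names and the example scores. The two-tailed p value uses the normal approximation without a tie correction and is therefore only a rough guide for groups of ten.

```python
import math


def rank_with_ties(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    return ranks


def mann_whitney_u(group_a, group_b):
    """Return (U for group_a, two-tailed p from the normal approximation)."""
    n_a, n_b = len(group_a), len(group_b)
    ranks = rank_with_ties(list(group_a) + list(group_b))
    rank_sum_a = sum(ranks[:n_a])
    u_a = rank_sum_a - n_a * (n_a + 1) / 2
    mean_u = n_a * n_b / 2
    sd_u = math.sqrt(n_a * n_b * (n_a + n_b + 1) / 12)
    z = (u_a - mean_u) / sd_u
    p_two_tailed = math.erfc(abs(z) / math.sqrt(2))
    return u_a, p_two_tailed


if __name__ == "__main__":
    # Hypothetical V scores for two groups; not data from the study.
    injured = [3.6, 3.9, 3.4, 3.7, 3.5]
    control = [2.8, 3.0, 2.6, 2.9, 3.1]
    u, p = mann_whitney_u(injured, control)
    print("U =", u, "two-tailed p (approx.) =", round(p, 4))
```

Under the .05 criterion stated above, the null hypothesis would be rejected when the two-tailed p value falls below .05.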
RESULTS AND DISCUSSION 


Threshold results are shown in Table 1. 
Higher average thresholds (longer exposure 


⁴ At near-threshold values, the dim line did not look 
so much like a line but just more like a thin strip of 
light. In order to obtain the lowest possible thresholds, 
Ss were trained to respond to any light in addition to 
the bright line. This actually was only an issue during 
the training period with three Ss and was readily 
clarified. 


TABLE 1
AVERAGE LIGHT PERCEPTION THRESHOLDS (L_Desc
AND L_Asc) AND APPARENT MOVEMENT THRESHOLDS
(M_Desc AND M_Asc) IN MILLISECONDS FOR THE
BRAIN-INJURED AND CONTROL GROUPS
OF TEN Ss EACH

                  Brain-Injured         Control
Thresholds        Mean     SD           Mean      SD

Absolute Light Perception
L_Desc            2.9      1.0          1.[?]     [illegible]
L_Asc             2.8      .8           1.[?]     [illegible]

Apparent Movement Perception
M_Desc            1.8      .5           1.3       [illegible]
M_Asc             2.4      1.0          1.6       [illegible]

* Significant at the .05 level of confidence.
** Significant at the .01 level of confidence.


TABLE 2
AVERAGE LIGHT PERCEPTION (L_Desc AND L_Asc) AND
APPARENT MOVEMENT (M_Desc AND M_Asc) V SCORES
(LOG VARIANCE SCORES + 4) FOR THE BRAIN-
INJURED AND CONTROL GROUPS OF
TEN Ss EACH

                  Brain-Injured               Controls
Thresholds        Mean           F(a)         Mean          F(a)

Absolute Light Perception
L_Desc            3.60***        1.80         2.[?]         [illegible]
L_Asc             3.[?]0***      5.11         2.81          [illegible]

Apparent Movement Perception
M_Desc            [illegible]    [illegible]  2.83          [illegible]
M_Asc             3.55***        6.33         [illegible]   [illegible]

*** Significantly different from control group at
the .001 level of confidence.

(a) F = variance of V scores/.0469. If in fact
all Ss within a group had the same variance, an
F ratio of 1.88 would occur only 1 time in 20. See
text and Footnote 3.


[Table entries that cannot be assigned to a row or column: 80, 91, 3.08, 1.33, 3.75, 3.11, 2.75]





time) were found in the brain-injured than in 
the control Ss. This supports the hypothesis 
that visual efficiency is impaired in the brain- 
injured. The difference between the groups is 
significant both for the absolute light percep- 
tion thresholds (p < .02) and for apparent 
movement thresholds (p < .05). Since CFF 
thresholds did not discriminate as sharply 
between these groups (3), light perception 
thresholds may be more sensitive to a presum- 
ably nonoccipital lesion than CFF thresholds. 

The average V scores of the brain-injured 
group were significantly larger than those of 
the control group (Table 2). This confirms the 
major hypothesis that the brain-injured show 
increased intra-individual variability. The dif- 
ference between the groups is significant for 
the ascending and descending absolute light
perception thresholds as well as for the ascending
and descending apparent movement
thresholds (p < .001). 

The large difference in average V scores 
between the two groups suggests that this 
measure may be useful for diagnosing brain 
injury. In addition, for those tasks in which 
significant F ratios were obtained (as in the 
ascending measures of both groups), the tech- 
nique may be sensitive enough to reflect indi- 
vidual differences within a group. (Significant 
F ratios, as described in Footnote 3, indicate 
that the interindividual variability within a 
group is greater than would be expected by 
chance.) Thus, from the viewpoint of diag- 
nostic potential, the ascending V scores suggest 
the possibility that individuals within dichoto- 
mized groups may be ranked on a variability 
scale, which may then be standardized and 
correlated with other measures or clinical 
categories. 

In a larger study (5), further experiments 
were conducted to determine whether differ- 
ences found in the above groups could be 
reproduced with other groups. By lowering 
the age limit to eight years and making some 
changes in procedure, seven additional brain- 
injured Ss could be tested.⁵ Significant differences
in thresholds and V scores between this
younger group of seven brain-injured Ss and 
seven comparable control Ss were found, thus 
supporting the findings with the older groups 
reported here. 

It is also of interest to note that the light 
exposure time necessary to produce the appar- 
ent movement effect was considerably shorter 
(an average of .6 ms) than the exposure time 


⁵ In order to keep the groups comparable in terms of 
experiential background, educational experience, and 
immediate testing environment, children from other 
schools or outside the school system were not included 
in the study. One of the authors (Pasamanick) was con- 
sultant for handicapped children with the Baltimore 
school system. His experience suggests that except for 
two or three children in another school, Negro before 
desegregation, and a very small number of children, 
usually with lesser involvement, scattered throughout 
the city, the total of 19 brain-injured children tested 
throughout the year in the larger study constitutes 
almost the entire population of pyramidal tract 
damaged children of school age satisfying the conditions 
described and excluding those meaningfully placed into 
the “mental defective” category. 


necessary for the absolute light perception 
thresholds in 17 out of 20 Ss tested. It may be 
that under these experimental conditions, the 
apparent movement phenomenon is a more 
sensitive index to sensory excitation than the 
absolute light perception threshold. 


SUMMARY 

Magnitude and intra-individual variability 
of absolute light perception and apparent 
movement thresholds of ten children with 
pyramidal tract damage were compared with 
the threshold and variability of ten non-brain- 
injured handicapped children of comparable 
age, IQ, and sex distribution. The results show 
that the brain-injured Ss displayed signif- 
icantly higher thresholds than the control Ss, 
thus supporting the hypothesis that visual 
efficiency may be lowered demonstrably in 
brain-injured Ss with presumably nonoccipital 
lesions. Moreover, as predicted, the brain- 
injured showed consistently greater intra- 
individual variability than the control Ss. 
When viewed in light of similar studies, the 
results support the major hypothesis that 
organic dysfunction gives rise to lack of 
consistency of one sort or another. The 
diagnostic potentialities of the employed experimental
procedures were discussed.

THE EFFECT OF SUBLIMINAL STIMULATION UPON AUTONOMIC
AND VERBAL BEHAVIOR 


N. F. DIXON 
University College, London 


OVER the last fifty years a number of
researches (2, 8, 15, 18) have shown
that subliminal¹ stimuli may exercise
a selective function upon verbal behavior.
A common feature of these experiments was 
that S made his guesses from a strictly limited 
set of possible responses. In this situation, 
the subliminal stimulus, whether auditory or 
visual, was found to increase the probability 
of S’s responding verbally in terms of what 
was currently presented. 

The results from an earlier research (6) by 
the writer, however, have suggested that 
subliminal stimulation may also prove influ- 
ential in a free choice situation. Asked to say 
“the first word that comes to mind” during 
subliminal auditory stimulation with words, 
Ss showed a significant tendency to produce 
autistic associations to the stimuli. Moreover, 
a significant correlation was found between 
response latencies and the emotional value of 
the stimulus items. The first of these phenom- 
ena was noted as being consistent with the 
informal observations of Bruner, McGinnies, 
and Licklider² that prerecognition hypotheses, 
in tachistoscopic experiments and word articu- 
lation tests, were often associations of an 
autistic kind to the stimulus material. The 
finding is also consistent with the experimental
conclusions of Fisher (9) and Poetzl
(14). 

The present pilot study, some aspects of 
which have been covered in a previous paper 
(7), is concerned with investigating the possi- 
bility of these phenomena when the stimulus 
is visual. It is designed, moreover, to test the 
subception hypothesis as put forward by 
Lazarus and McCleary (14). This hypothesis, 
which postulates a hierarchy of response thresholds,
derives from the finding that GSRs
conditioned to verbal stimuli could still be 


¹ For the purpose of this paper, a “subliminal”
stimulus is one of whose nature and presence S remains
wholly unaware—i.e., its intensity is below the
awareness threshold.

² Personal communications.


evoked below the verbal recognition threshold. 
Based on this finding, however, the subception 
hypothesis has been the object of considerable 
dispute. 

In particular, Howes (11) and more recently 
Eriksen (9) have questioned it on the grounds 
that the subception effect could be an artifact 
arising from the Lazarus and McCleary experi- 
mental design. The essence of their argument 
is that a comparison of GSRs and verbal 
responses can only be meaningful provided 
S has an equal opportunity of demonstrating 
his discriminatory capacity in the two response 
systems. In the Lazarus and McCleary experi- 
ment this condition clearly did not obtain. 
With an infinite number of possible GSR 
categories and only ten verbal response cate- 
gories, S could signal partial discrimination by 
the one response system but not by the other 
or, to quote Eriksen, “...if the number of 
verbal response categories is too few we run 
the risk of spuriously reducing the correlation 
between the verbal response and the stimuli, 
and of increasing the partial correlation be- 
tween the stimuli and the GSR.” In support 
of this argument, it may be added that at 
least two researches (3, 5) have shown that 
partial discrimination of briefly exposed verbal 
material can in fact occur and may be meas- 
ured by applying suitable operations and 
controls. 

The following experiment attempts to cir- 
cumvent these criticisms in two ways; first, 
by allowing S a potentially infinite range of 
verbal response categories and, second, by 
ensuring total unawareness of the stimulus 
presentation—a condition not necessitated by 
tachistoscopic exposure below the recognition 
threshold. 

Hypothesis 1. That the visual presentation of 
words at an intensity below the awareness 
threshold would influence the selection of 
verbal responses in a free choice situation. 

Hypothesis 2. That this selective function of 
the subliminal stimulus would result in verbal 
associations to the stimulus items.

Hypothesis 3. That subjects presented subsequently
with their responses would be able
to say to which stimulus word each of these
had been given.

Hypothesis 4. That galvanic skin responses 
and verbal response latencies recorded during 
stimulation with emotionally charged words 
would exceed those recorded for neutral items. 


METHOD 
Subjects 


Seven Ss took part, of whom five were male. 
All the Ss were undergraduates, four being 
students of psychology. 


Apparatus and Design 


The design of the experiment involved four 
stages: 

1. Determination of the difference thresh- 
old for brightness. 

2. Subliminal stimulation at a controlled 
intensity and for particular periods of time, 
with presentation at regular intervals of a 
supraliminal signal to respond. During this 
stage, verbal responses, GSRs, and the re- 
sponse latencies were recorded. 

3. A word association test in which immediate
associations were obtained to the verbal
responses given at Stage 2. 

4. A matching test in which Ss were re- 
quired to say to which of the stimulus items 
each of their responses had been given. 

For threshold determination and subsequent 
subliminal stimulation, the stimulus material, 
on 35-mm film, was projected upon an opal 
flashed translucent screen, image brightness 
being controlled by a pair of opposed neutral 
variable filters with a continuous density 
gradient of .1 per inch. Exposure of the stimu- 
lus material was by means of a shutter. The 
response signal was given by the momentary 
appearance, in the upper half of the screen, of 
a small spot of light. This signal was timed to 
occur two seconds after the shutter opened. 

S’s verbal responses closed the shutter after 
each exposure by operating a voice key. Reac- 
tion times between the appearance of the 
signal light and the verbal responses were 
recorded on a chronotron. GSRs following 
each exposure were measured on a Pye Scalamp 
galvanometer in a conventional bridge circuit, 
electrical contact with the skin being mediated 
by a pair of nonpolarizing silver silver-chloride 
electrodes each immersed in a finger beaker 
containing saline solution. 





The stimulus material consisted of 12 items, 
10 words and 2 straight lines (one being verti- 
cal, the other horizontal). Six of the words were 
of primary emotional interest, the other four 
being ‘emotionally neutral.’ The items were 
presented in the following order: this, [horizontal
line], penis, barn, father, [vertical line],
vagina, seven, male sex organ, line, mother, 
female sex organ. 


Procedure 


The S was seated in a dark room one meter 
from and facing the translucent screen illumi- 
nated to present a uniform brightness of .3 log 
ft lamberts. A horizontal line of light, sub- 
tending a visual angle of 2°48’, was projected 
onto the center of the screen at such an inten- 
sity as to be clearly visible. S was instructed: 


I am going to make the line gradually dimmer. All you 
have to do is to watch it carefully and say “stop” the 
moment it has completely disappeared. Don’t say 
“stop” until you are quite certain you cannot see any- 
thing at all upon the screen. 


The brightness of the image was reduced at the 
rate of .1 log ft lamberts per second until 
such time as S gave his verdict. This procedure 
was repeated with the line in both vertical and 
horizontal planes until the lowest threshold 
value had been found, the criterion for “lowest 
value” being one not equalled or bettered in 
three subsequent trials. 
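
The stopping rule described above can be sketched in Python as follows. This is a minimal illustration, assuming each trial yields one brightness reading in log ft lamberts and that a lower reading counts as a better threshold; the function name and the example readings are hypothetical and are not taken from the experiment.

```python
def lowest_threshold(trial_values, confirmations=3):
    """Return (lowest value, trials used) once the lowest value so far has
    not been equalled or bettered in `confirmations` subsequent trials.

    Returns None if the readings run out before the criterion is met.
    """
    best = None
    worse_in_a_row = 0
    for trials_used, value in enumerate(trial_values, start=1):
        if best is None or value <= best:
            # A new lowest (or equal) reading restarts the confirmation count.
            best = value
            worse_in_a_row = 0
        else:
            worse_in_a_row += 1
            if worse_in_a_row == confirmations:
                return best, trials_used
    return None


if __name__ == "__main__":
    # Hypothetical readings in log ft lamberts, one per descending trial.
    readings = [1.2, 1.0, 1.1, 0.9, 1.0, 0.95, 1.3]
    print(lowest_threshold(readings))
```

An equal reading resets the count in this sketch because the criterion requires that the lowest value be neither equalled nor bettered in the three trials that follow it.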

The first and third fingers of S’s right hand 
were then immersed respectively in the two 
electrode beakers, the arm being supported on 
a padded rest with the hand lightly clamped. 
S’s basic skin resistance was balanced against 
the bridge circuit and measured by substitu- 
tion. 

The following instructions were then given: 


I am going to project words on the screen but they will 
be too dim for you to see. With each projection how- 
ever I want you to make a guess regarding the identity 
of the word shown. Your signal that a word is there and 
that you should respond will be a small spot of light 
appearing momentarily at the top of the screen. Dur- 
ing the experiment keep watching the screen though 
you can see nothing on it. Try and maintain a relaxed 
passive attitude just letting each guess be the first 
word that comes to mind after the signal light appears. 


A practice trial of ten responses was given, 
with the shutter closed. 

The variable wedge was then set to give an
image brightness .3 log ft lamberts below the 
determined threshold and the first stimulus 
item exposed. Two seconds later, the signal 
light was presented and the chronotron started.
S’s verbal response by activating the voice 
key stopped the chronotron and closed the 
shutter. The verbal response and response 
latency were recorded by E. A record was also
taken of the observed maximum deflection of
the GSR galvanometer (as indicated by a hand 
follower). Five seconds after the skin resistance 
had returned to the baseline, the next stimulus 
item was presented and the procedure repeated 
as before. 

After each response, S was asked if he had 
seen anything. (In no case did any of the Ss
report awareness of the stimulus.) The entire 
series of 12 items was presented, in all, four 
times to each S, the order of presentation being 
the same each time, with each trial following 
immediately upon the preceding one (i.e., S 
was given no clue to the fact that a series was 
being repeated). Immediately following this 
part of the experiment, S was asked for his 
first association to each of the responses he 
had given. 

For the third stage in the experiment, carried 
out a week later, S was given a card upon which 
were printed in a random order the 12 items 
of stimulus material that had been presented 
subliminally. S was asked whether the list 
seemed in any way familiar. In every case the 
reply was to the contrary. The following 
instructions were given: 


I am going to read you a list of words [S’s original 
responses randomly ordered]. After each I want you 
to tell me the number of that item on your list which 
in your opinion is most appropriate to the word I
have given. I shall not ask you for the reasons for your 
choice which should be entirely personal and not bound 
by logical or commonsense considerations; just choose 
the item which you feel, for any reason whatever, is 
most suited to the word I have given. Take as much 
time as you like, then give me the number of your 
choice. 


This entire procedure was carried out for 
all seven Ss. 


RESULTS 
Response Latencies 


The first hypothesis tested was that response 
latencies recorded after presentation of emo- 
tional items would exceed those recorded after 
neutral items. An analysis of the data revealed 
that five out of the seven Ss who took part
showed shorter average latencies for the
neutral items than they did for the emotional
items. The differences were not statistically
significant, however, by the sign test.


TABLE 1
GALVANIC SKIN RESPONSES

[Table body not recoverable. Column heads: Mean GSRs ("Emotional" Items, "Neutral" Items); Means of Maximum ("Emotional" Items, "Neutral" Items).]


TABLE 2
SIGNIFICANCE OF CORRECT MATCHES

[Table body only partly recoverable. Column heads: Subject*, Chance Expectation, Observed Correct, Variance. One column of seven values survives, its heading uncertain: 3.69, 3.59, 3.61, 3.55, 2.87, 3.58, 2.72.]

* The prefix M or F denotes the sex of S; the suffix p indicates
that S was a student of psychology.


Galvanic Skin Responses 


The hypothesis tested was that GSRs ac- 
companying the presentation of emotional 
items would exceed those recorded for neutral 
items. The results showed that all seven Ss 
gave higher average GSRs for emotional items 
than they did for neutral ones (see Table 1). 
By the sign test this result in the direction 
predicted is significant (p = .008). 
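
The sign-test probabilities reported for the latency and GSR comparisons can be checked directly. The following is a minimal sketch, assuming a one-tailed test against a null probability of .5 with no tied Ss; the function name is illustrative and does not come from the report.

```python
from math import comb


def sign_test_one_tailed(successes: int, n: int) -> float:
    """P(X >= successes) for X ~ Binomial(n, 0.5)."""
    return sum(comb(n, k) for k in range(successes, n + 1)) / 2 ** n


gsr_p = sign_test_one_tailed(7, 7)      # all seven Ss in the predicted direction
latency_p = sign_test_one_tailed(5, 7)  # five of the seven Ss in the predicted direction
print(round(gsr_p, 3), round(latency_p, 3))  # 0.008 0.227
```

Under these assumptions, seven of seven gives the reported p = .008, while five of seven falls well short of significance.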


Verbal Responses 


The first hypothesis tested was that when 
presented subsequently with their responses, 
Ss would be able to say to which stimulus word 
each had been given. For the purpose of this 
test, a statistical method developed by Stevens 
(17) was adopted, the advantage of this test 
being that it takes into account the total 
variance of all responses made. 

From Table 2 it can be seen that of the seven 
Ss who participated in this experiment, five 
matched their responses against the correct 
stimulus item more often than would be pre- 
dicted on a chance basis, two of them by an 
amount that is statistically significant. The
other two Ss scored slightly below the chance
expectation.


TABLE 3
RESPONSES CORRECTLY MATCHED TO STIMULUS WORDS

Subjects: MM, MT, MTp, MDp, FB, FSp, MBp
[The assignment of responses to individual subjects is not recoverable; the original responses are pooled here by stimulus word.]

Stimulus Word*       Original Responses
Penis                Run, shower
Male Sex Organ       Brush, cheroot
Female Sex Organ     Supply, follow
Vagina               Shade, mouth, focus, triumph
Father               Bulbous, ear, face, swim, food
Mother               Screen, silver, sugar, spot, species, pleasant, beat
Barn                 Cartwheel, cradle, star, open, pony, purple, w [illegible]
Line                 Valparaiso, sidelong, circle, flight, light, do
Seven                Depth, crowd, book
This                 Face

* No correct matches occurred for responses to the vertical or the horizontal line.


TABLE 4
RESPONSES MATCHED TO SYNONYMOUS STIMULUS WORDS

Original Stimulus Word   Synonymous Match   Responses
Penis                    M.S.O.             Million, Moore, deny, chair
M.S.O.                   Penis              thick, eject, figure, line, red chalk
Vagina                   F.S.O.             nostril, ready, glorify
F.S.O.                   Vagina             bloodshot, noon

It is a convenient feature of the statistical 
method used that both the deviations from 
chance and the variances are additive over a 
number of Ss, thus providing for an over-all 
test of significance. By this method, a standard 
normal deviate of 2.57 is obtained, significant 
(p = .005) on a one-tail test. Table 3 shows the
actual responses that each S succeeded in 
matching correctly. 
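
The pooling step described above can be sketched as follows. This is an illustration only: it assumes the normal approximation, the function names are not from the report, and the per-subject values in the usage line are hypothetical rather than the entries of Table 2.

```python
from math import erfc, sqrt


def pooled_z(observed, expected, variances):
    """Add the deviations from chance and the variances over Ss, then form z."""
    total_deviation = sum(o - e for o, e in zip(observed, expected))
    return total_deviation / sqrt(sum(variances))


def one_tailed_p(z):
    """Upper-tail probability of a standard normal deviate."""
    return 0.5 * erfc(z / sqrt(2))


# Hypothetical per-subject values, for illustration only.
z = pooled_z(observed=[6, 5, 3], expected=[4.0, 4.0, 4.0], variances=[3.6, 3.6, 3.6])
print(round(z, 2))

# The reported deviate of 2.57 corresponds to a one-tailed p of about .005.
print(round(one_tailed_p(2.57), 3))
```

Because both sums run over Ss, a single deviate summarizes the whole group even when no individual S reaches significance.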


The Matching of Responses to Synonymous 
Stimulus Words 


It will be recalled that the stimulus material 
included two pairs of synonymous stimulus 
items: “Penis” and “Male Sex Organ,” and 
“Vagina” and “Female Sex Organ.” These 
were included in order to test the hypothesis 
that the response to the subliminal stimulus 
material is a response to the meaning which 


the latter connotes, the prediction being that 
there would be a significant tendency for Ss 
to match their responses against the synonym 
of the item which had originally evoked it. 
These synonymously matched responses are 
shown in Table 4. Applying the Stevens test 
to these matches yields a significant value for 
Z of 2.408 (p = .008). 

An interesting post hoc observation was that 
no correct matches occurred for the 54 re- 
sponses given during subliminal stimulation 
with the vertical and horizontal lines. This 
effect was not predicted though it does perhaps 
fit in with a conclusion drawn by the writer 
from an earlier research that the subliminal 
stimulus influences verbal behavior by the 
facilitation of long standing associations be- 
tween objects and the verbal responses which 
have accompanied these objects in the past of 
the individual concerned. 


The Nature of the Verbal Responses 


Having justified the assumption of a causal 
relationship between subliminal stimulus and 
verbal response by the preceding analyses of 
GSRs and response matches, the extent to 
which these matches support the hypothesis 
that the subliminal stimulus evokes verbal 
associations was further examined in the light 
of the actual responses given to each of the 
stimulus items. 

The results from this study which are re- 
ferred to in an earlier paper (7) revealed, first, 
that many of the responses were likely asso-
ciations to the stimulus words and, second,
that there was a tendency to guess words that 
were symbolic (in the Freudian sense) of the 
objects connoted by the stimulus word. These 
qualitative interpretations were supported to 
some extent by the S’s matchings and to some 
extent by the associations given to the re- 
sponses. Thus, to the subliminal stimulus word 
“Line” the Subject MM responded “Val- 
paraiso.” His subsequent association to “Val- 
paraiso” was “Liner.” 


Introspective Data 


For investigations into the effect upon 
behavior of subliminal stimulation, introspec- 
tive data are obviously of prime importance in 
establishing whether or not the stimulation was 
in fact subliminal. In this experiment, three 
appeals were made to introspection for evi-
dence on this point: (a) in the original thresh-
old determination; (b) in a general instruction
prior to subliminal stimulation (the instruc- 
tion was to report immediately during the 
course of the experiment any awareness of 
anything on the screen other than the periodic 
signal light); and (c) at the end of the experi- 
ment when each S was asked whether at any 
time he had seen anything at all upon the 
screen. The question of threshold determina- 
tion has already been dealt with. Regarding 
the possible reporting of awareness during 
the experiment, none of the Ss reported seeing 
anything. Finally, when questioned after the 
experiment, the Ss were unanimous in re- 
porting that they had seen nothing and that 
their responses had in fact all been pure 
guesses. 

Such other introspective data as the Ss were 
able to provide are shown below: 


MM   Words just came up when I saw the signal. I
     had no imagery.

MT   I had visual images of wheat, faces, and birds.
     [These objects figured amongst his responses.]

MTp  I had many extraneous thoughts. “River”
     came up twice. I thought of words in an en-
     cyclopedia, also of words already given.

FB   Nothing to report.

FSp  Slightly fatigued at end. Apprehensive at be-
     ginning about what I might say. Frequently
     thought of responses amongst which five
     tabu words came up.

MBp  Nothing else to report.


These introspections reveal little beyond the
fact that the Ss seemed to have little insight
into the origin of their responses. The comment 
that “words just came up” suggests a some- 
what passive arrival in consciousness of the 
response to make, rather than an active process 
of guessing or selecting, the appropriate 
response. 


DISCUSSION 


The subception hypothesis which postulates 
a hierarchy of response thresholds has met
with resistance on two counts. On the one 
hand it is regarded as uneconomical, on the 
other as inviting a homuncular theory of 
perception, particularly when related to the 
concept of perceptual defense. To avoid what 
Allport (1) has called the anthropomorphic 
notion of some inner perceiving agent, other 
explanations have been given of the relevant 
phenomena as, for example, “voluntary sup- 
pression” in the case of apparent perceptual 
defense and “a statistical artifact” in the 
case of the Lazarus and McCleary findings.

The results from the present experiment cannot,
however, be so easily explained by these
means. At the time of threshold determina- 
tion, the Ss had nothing to suppress and 
thenceforth were being stimulated well below 
this threshold. In the second place, explanation 
of the differential GSR behavior in terms of 
partial discrimination at a conscious verbal 
level seems unlikely for two reasons. In the 
first place, although S had a virtually infinite 
range of verbal response categories his re- 
sponses bore no structural similarity whatever 
to the stimulus items. In the second place, 
the introspective data confirmed that the Ss 
were completely unaware of the stimulus 
presentations. 

In this connection it should be recalled that 
this experiment differed from those of Lazarus 
and McCleary on the one hand, and Chapanis 
and Bricker on the other, inasmuch as stimula- 
tion was applied not merely below the recogni- 
tion threshold but below what for convenience’ 
sake we may describe as the absolute awareness 
threshold. Ss knew that they were being 
stimulated only because they were told so by 
E, not because they were aware of something
in the visual field. 

This raises a problem crucial to the whole 
concept of subception. How, it may be asked, 
can we assume that there was no discrimination
at a verbal level while at the same time as- 
serting that the verbal responses were asso- 
ciations to the stimulus items? In the writer’s 
opinion, the apparent paradox stems from a 
confusion that has permeated much recent 
discussion of the topic. It is the confusion 
between three quite distinct orders of re- 
sponse—awareness of the stimulus, verbaliza- 
tion about this awareness, and verbal behavior 
determined by, but subjectively unrelated to, 
the stimulus. The confusion arises no doubt 
from the mistaken parallel that is sometimes 
drawn between repression and certain percep- 
tual processes. Because repression is sometimes 
regarded as the inability to verbalize ideas 
and feelings it does not necessarily follow that 
the ability to verbalize can be taken as a 
criterion of perceptual awareness. 

If ability to verbalize could be so used, then, 
as one psychologist (13) has recently remarked, 
young children and aphasics might well be 
useful subjects for the demonstration of 
subception merely because they would in all 
probability react emotionally to stimuli about 
which they could not verbalize. The sugges- 
tion here is that not only are awareness and 
verbal thresholds far from synonymous but 
that the relationship between them is consider- 
ably more complex than some writers would 
have us suppose. Consider, for example, some 
of the possible relations that may be observed: 

1. S is wholly unaware of the stimulus and
verbalizes, “I can see nothing, there is nothing 
there.” No correlation can be found between 
his other verbal responses and the stimulus. 

2. S is unaware of the nature of the stimulus 
but aware of the fact that he is being stim- 
ulated. He can detect something in the visual 
field. He verbalizes, “There is something there 
but I cannot say what.” A partial correlation, 
however, is found between the structure of his 
verbal responses and that of the stimulus 
items. 

3. S is wholly unaware of the stimulus. He
verbalizes, “I can see nothing, there is nothing 
there,” but some correlation is found, not 
between the structure, but between the 
meaning area of the stimulus material and 
that of his responses. 

This latter case is exemplified by our present 
data. In other words, resolution of the apparent 





paradox depends upon distinguishing between 
the verbal response system underlying in- 
tended verbal report and activated by stimuli 
above the awareness threshold and that of 
autistic verbal behavior activated by stimuli 
below this threshold. Two criteria may be 
applied to this distinction: first, whether or 
not S reports any awareness of the stimulus 
and, second, whether the observed relationship 
between the stimuli and responses is such as he 
was programmed to make. In the present case, 
the level of intended verbal report was clearly 
not active. S reported no awareness of the 
stimulus, and the relationship between his 
responses and the stimuli was not that which 
he had been instructed to observe. 

The evidence then suggests that with this 
admittedly small sample, partial discrimina- 
tion did occur below the level of awareness. 
The data do not require us to regard the thresh-
old for autonomic discrimination as neces-
sarily lower than that for determination of
verbal behavior but merely to assume that 
both these thresholds lie below those of stimu- 
lus awareness and intended verbal report. If 
we allow that stimulus awareness and intended 
verbal report belong to one response system, 
while autonomic response or the determina- 
tion of autistic verbal behavior belong to 
another, then our data suggest that subception 
occurred. Whether the autonomic response 
precedes, is coincident with, or follows upon 
the verbal behavior remains a matter for 
conjecture. Since it was the stimulus items 
rather than the verbal responses that were 
emotionally laden, the first of these alterna- 
tives seems the more likely. 

However, while the data support the subcep- 
tion hypothesis, they do not permit us to 
assume that the differential GSRs were a 
function of an unconscious emotional disturb- 
ance. They could be due to the greater diffi- 
culty of associating to a tabu word, as recently 
demonstrated by Jacobs (11). To what extent 
the difficulty of associating is a function of the 
emotional nature as opposed to the mere 
unfamiliarity of these words remains an open 
question. That emotional value is a factor 
however is supported by the present finding 
that the GSRs recorded for “Father” and 
“Mother” exceeded those recorded for any of 
the neutral words though these were of the 
same frequency rating. Also in favor of the
hypothesis that the higher GSRs recorded for 
the emotional items do in fact reflect the 
emotional impact of these words is the decline 
in GSR with successive presentations. The 
occurrence of associations often of an autistic 
kind in the context of a subliminal experiment 
has been dealt with extensively elsewhere (7). 
Suffice it to say that it suggests the operation 
of an ontogenetically determined hierarchy of 
classificatory levels wherein the subliminal 
stimulus exercises a selective function upon 
responses hitherto associated with the stimulus 
material, the occurrence probabilities of partic- 
ular associations being determined by emo- 
tional as opposed to logical factors; that is, 
the subliminal stimulus activates primary 
thought processes. 

That there is, even in the waking state, a 
general tendency towards autism in cases of 
reduced sensory input quite irrespective of 
particular unconscious discrimination is inci- 
dentally suggested by the recent experiments 
of Bexton, Heron, and Scott (4). 


SUMMARY 


This pilot experiment was undertaken to 
examine the general hypothesis that the effect 
upon verbal and autonomic behavior of sub- 
liminal visual stimulation would be comparable 
to that found in the case of subliminal stimu- 
lation that was auditory. In particular, the 
design was orientated towards investigating 
the subception hypothesis of Lazarus and Mc- 
Cleary. The hypotheses tested were that: 

1. Response latencies and GSRs would be 
determined by the affective value of the stimu- 
lus material. 

2. The verbal guesses made during sublim-
inal stimulation would be associations to the
stimulus items as shown by the subject’s 
ability to indicate subsequently that stimulus 
item to which each response had been given. 

In addition to these main hypotheses, the 
design allowed a test as to whether responses 
made following the presentation of particular 
items would be matched against synonymous 
items, thus indicating that the original re- 
sponse had been a response to meaning. 


For the experiment, seven subjects were 
each presented at a subliminal intensity with 
12 items of stimulus material, dichotomized 
regarding their emotional values, in four con- 
secutive trials. GSRs, response latencies, and 
verbal guesses were recorded for each presenta- 
tion. Subsequently, each subject was presented 
with his responses and asked to match these 
against the 12 stimulus items. 

1. Little evidence was found to support 
the hypothesis that response latency would be 
a function of the emotional value of the stimu-
lus material.

2. The GSR hypothesis, however, was sup- 
ported at a significant level of probability. 

3. That the significant relationship between 
GSRs and the stimulus material justified the 
assumption of a causal relationship between the 
subliminal stimulus and the verbal response, 
received support from the second part of the 
experiment in which it was found that the 
group as a whole were able to match their 
verbal responses against the appropriate stimu- 
lus items to an extent that was statistically 
significant. That the responses were, in fact, 
responses to meaning was further supported by 
the subjects’ ability to match them against the 
synonyms of the stimulus item by which they 
had been evoked. The results as a whole were 
considered to be supportive of the subception 
hypothesis. 
ATTITUDES ESTABLISHED BY CLASSICAL CONDITIONING¹


ARTHUR W. STAATS AND CAROLYN K. STAATS
Arizona State College at Tempe 


Osgood and Tannenbaum have stated,
“... The meaning of a concept is its
location in a space defined by some
number of factors or dimensions, and attitude
toward a concept is its projection onto one of 
these dimensions defined as ‘evaluative’” (9, 
p. 42). Thus, attitudes evoked by concepts 
are considered part of the total meaning of 
the concepts. 

A number of psychologists, such as Cofer and 
Foley (1), Mowrer (5), and Osgood (6, 7), to 
mention a few, view meaning as a response—an 
implicit response with cue functions which 
may mediate other responses. A very similar 
analysis has been made of the concept of 
attitudes by Doob, who states, “ ‘An attitude
is an implicit response... which is considered 
socially significant in the individual’s society’ ” 
(2, p. 144). Doob further emphasizes the 
learned character of attitudes and states, “The
learning process, therefore, is crucial to an 
understanding of the behavior of attitudes” (2, 
p. 138). If attitudes are to be considered 
responses, then the learning process should be 
the same as for other responses. As an example, 
the principles of classical conditioning should 
apply to attitudes. 

The present authors (12), in three experi- 
ments, recently conditioned the evaluative, 
potency, and activity components of word 
meaning found by Osgood and Suci (8) to 
contiguously presented nonsense syllables. The 
results supported the conception that meaning 
is a response and, further, indicated that word 
meaning is composed of components which can 
be separately conditioned. 

The present study extends the original 
experiments by studying the formation of 
attitudes (evaluative meaning) to socially 
significant verbal stimuli through classical con-
ditioning. The socially significant verbal
stimuli were national names and familiar 
masculine names. Both of these types of
stimuli, unlike nonsense syllables, would be
expected to evoke attitudinal responses on the
basis of the pre-experimental experience of the
Ss. Thus, the purpose of the present study is to
test the hypothesis that attitudes already
elicited by socially significant verbal stimuli
can be changed through classical conditioning,
using other words as unconditioned stimuli.


1 This study is part of a series of studies of verbal
behavior being conducted by the authors at Arizona
State College at Tempe. The project is sponsored by the
Office of Naval Research (Contract Number NONR-
2305 (00)), Arthur W. Staats, principal investigator.


METHOD 
Subjects
Ninety-three students in elementary psychology 


participated in the experiments as Ss to fulfill a course 
requirement. 


Procedure 


The general procedure employed was the same as in 
the previous study of the authors (12). 

Experiment I.—The procedures were administered 
to the Ss in groups. There were two groups with one half 
of the Ss in each group. Two types of stimuli were used: 
national names which were presented by slide pro- 
jection on a screen (CS words) and words which were 
presented orally by the E (US words), with Ss required
to repeat the word aloud immediately after E had
pronounced it. Ostensibly, Ss’ task was to separately 
learn the verbal stimuli simultaneously presented in the 
two different ways. 

Two tasks were first presented to train the Ss in the 
procedure and to orient them properly for the phase of 
the experiment where the hypotheses were tested. The 
first task was to learn five visually presented national 
names, each shown four times, in random order. Ss’ 
learning was tested by recall. The second task was to 
learn 33 auditorily presented words. Ss repeated each 
word aloud after E. Ss were tested by presenting 12
pairs of words. One of each pair was a word that had 
just been presented, and Ss were to recognize which 
one. 

The Ss were then told that the primary purpose of 
the experiment was to study “how both of these types of 
learning take place together—the effect that one has 
upon the other, and so on.” Six new national names 
were used for visual presentation: German, Swedish, 
Italian, French, Dutch, and Greek served as the CSs. 

These names were presented in random order, with 
exposures of five sec. Approximately one sec. after the 
CS name appeared on the screen, E pronounced the US
word with which it was paired. The intervals between 
exposures were less than one sec. Ss were told they 
could learn the visually presented names by just 
looking at them but that they should simultaneously 
concentrate on pronouncing the auditorily presented 
words aloud and to themselves, since there would be 
many of these words, each presented only once. 
The names were each visually presented 18 times in
random order, though never more than twice in 
succession, so that no systematic associations were 
formed between them. On each presentation, the CS 
name was paired with a different auditorily presented 
word, i.e., there were 18 conditioning trials. CS names 
were never paired with US words more than once so 
that stable associations were not formed between them. 
Thus, 108 different US words were used. The CS
names, Swedish and Dutch, were always paired with US
words with evaluative meaning. The other four CS
names were paired with words which had no systematic
meaning, e.g., chair, with, twelve. For Group 1, Dutch
was paired with different words which had positive 
evaluative meaning, e.g., gift, sacred, happy; and 
Swedish was paired with words which had negative 
evaluative meaning, e.g., bitter, ugly, failure. For 
Group 2, the order of Dutch and Swedish was reversed 
so that Dutch was paired with words with negative 
evaluative meaning and Swedish with positive meaning 
words. 
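
The presentation constraints and the counterbalanced pairing described above can be sketched as follows. The report does not say how the random order was actually generated, so the restart-on-dead-end method, the function names, and the seed are all assumptions made for illustration.

```python
import random
from collections import Counter

CS_NAMES = ["German", "Swedish", "Italian", "French", "Dutch", "Greek"]


def presentation_order(names, repeats=18, max_run=2, seed=0):
    """Random order with each name `repeats` times and no run longer than `max_run`."""
    rng = random.Random(seed)
    while True:  # restart from scratch on the rare dead end
        remaining = Counter({name: repeats for name in names})
        order = []
        while sum(remaining.values()) > 0:
            # Exclude a name that already fills the last `max_run` positions.
            allowed = [
                name for name, count in remaining.items()
                if count > 0 and order[-max_run:] != [name] * max_run
            ]
            if not allowed:
                break
            weights = [remaining[name] for name in allowed]
            pick = rng.choices(allowed, weights=weights)[0]
            order.append(pick)
            remaining[pick] -= 1
        else:
            return order


def us_category(cs_name, group):
    """Kind of US word paired with a CS name in Experiment I."""
    positive_name = "Dutch" if group == 1 else "Swedish"
    negative_name = "Swedish" if group == 1 else "Dutch"
    if cs_name == positive_name:
        return "positive"
    if cs_name == negative_name:
        return "negative"
    return "no systematic meaning"


order = presentation_order(CS_NAMES)
print(len(order), us_category("Dutch", 1), us_category("Dutch", 2))
```

With six names shown 18 times each, the order has 108 positions, one for each of the 108 US words.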

When the conditioning phase was completed, Ss 
were told that E first wished to find out how many of
the visually presented words they remembered. At the 
same time, they were told, it would be necessary to 
find out how they felt about the words since that might 
have affected how the words were learned. Each S was 
given a small booklet in which there were six pages. 
On each page was printed one of the six names and a 
semantic differential scale. The scale was the seven- 
point scale of Osgood and Suci (8), with the con- 
tinuum from pleasant to unpleasant. An example is as 
follows: 


German 
pleasant :__ :__:__:__:__:__:___: unpleasant 

The Ss were told how to mark the scale and to 
indicate at the bottom of the page whether or not the 
word was one that had been presented. 

The Ss were then tested on the auditorily presented 
words. Finally, they were asked to write down anything 
they had thought about the experiment, especially the 
purpose of it, and so on, or anything they had thought 
of during the experiment. It was explained that this 
might have affected the way they had learned. 

Experiment II.—The procedure was exactly re- 
peated with another group of Ss except for the CS 
names. The names used were Harry, Tom, Jim, Ralph, 
Bill, and Bob. Again, half of the Ss were in Group 1 and 
half in Group 2. For Group 1, Tom was paired with 
positive evaluative words and Bill with negative
words. For Group 2 this was reversed. The semantic 
differential booklet was also the same except for the 
CS names. 


Design 

The data for the two experiments were treated in the
same manner. Three variables were involved in the
design: conditioned meaning (pleasant and unpleasant);
CS names (Dutch and Swedish, or Tom and Bill); and
groups (1 and 2). The scores on the semantic differential
given to each of the two CS words were analyzed in a
2 X 2 latin square as described by Lindquist (4, p. 278)
for his Type II design.


2 The complete list of CS-US word pairs is not pre-
sented here, but it has been deposited with the American
Documentation Institute. Order Document No. 5463
from ADI Auxiliary Publications Project, Photo-
duplication Service, Library of Congress, Washington
25, D. C., remitting in advance $1.25 for microfilm or
$1.25 for photocopies. Make checks payable to Chief,
Photoduplication Service, Library of Congress.


RESULTS 


The 17 Ss who indicated they were aware of 
either of the systematic name-word relation- 
ships were excluded from the analysis. This 
was done to prevent the interpretation that 
the conditioning of attitudes depended upon 
awareness. In order to maintain a counter- 
balanced design when these Ss were excluded, 
four Ss were randomly eliminated from the 
analysis. The resulting Ns were as follows: 24
in Experiment I and 48 in Experiment II. 

Table 1 presents the means and SDs of the 
meaning scores for Experiments I and II. 
The table itself is a representation of the 2 X 2 
design for each experiment. The pleasant
extreme of the evaluative scale was scored 1,
the unpleasant 7.


TABLE 1
MEANS AND SDs OF CONDITIONED ATTITUDE SCORES

                           Names
Experiment I          Dutch             Swedish
Group              Mean     SD       Mean     SD
1                  2.67    .94       3.42   1.50
2                  2.67   1.31     [1].83    .90

Experiment II          Tom               Bill
Group              Mean     SD       Mean     SD
1                  2.71   2.01       4.12   2.04
2                  3.42   2.55       1.79   1.07

Note. On the scales, pleasant is 1, unpleasant 7.


TABLE 2
SUMMARY OF THE RESULTS OF THE ANALYSIS OF
VARIANCE FOR EACH EXPERIMENT

                               Exp. I
Source                      df     MS      F
Between Ss
  Groups                     1    7.52   4.36*
  Error                     22    1.73
Within
  Conditioned attitude       1    7.52   5.52*
  Names                      1     .02
  Residual                  22    1.36
Total                       47

[The Exp. II columns are not recoverable from the source.]

* p < .05.
** p < .01.

The analysis of the data for both experi- 
ments is presented in Table 2. The results of 
the analysis indicate that the conditioning 
occurred in both cases. In Experiment I, the 
F for the conditioned attitudes was significant 
at better than the .05 level. In Experiment II, 
the F for the conditioned attitudes was signifi- 
cant at better than the .01 level. In both 
experiments the F for the groups variable was 
significant at the .05 level. 
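
The Experiment I F ratios can be checked against the mean squares that survive in Table 2 (7.52, 1.73, and 1.36). The following is a minimal sketch. It assumes that the groups effect is tested against the between-Ss error and the conditioned-attitude effect against the within-Ss residual, which is how a mixed design of this kind is usually analyzed; the function name is illustrative.

```python
def f_ratio(ms_effect: float, ms_error: float) -> float:
    """F as the ratio of an effect mean square to its error mean square."""
    return ms_effect / ms_error


groups_f = f_ratio(7.52, 1.73)       # groups against between-Ss error
conditioned_f = f_ratio(7.52, 1.36)  # conditioned attitude against within-Ss residual
print(round(groups_f, 2), round(conditioned_f, 2))
```

The ratios agree with the printed values of 4.36 and 5.52 to within the rounding of the tabled mean squares.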


DISCUSSION 


It was possible to condition the attitude 
component of the total meaning responses of 
US words to socially significant verbal stimuli, 
without Ss’ awareness. This conception is 
schematized in Fig. 1, and in so doing, the way 
the conditioning in this study was thought to 
have taken place is shown more specifically. 
The national name Dutch, in this example, is 
presented prior to the word pretty. Pretty 
elicits a meaning response. This is schematized 
in the figure as two component responses; an 
evaluative response rpy (in this example, the 
words have a positive value), and the other 
distinctive responses that characterize the 
meaning of the word, Rp. The pairing of 
Dutch and pretty results in associations between 
Dutch and rpy, and Dutch and Rp. In the fol- 
lowing presentations of Dutch and the words 
sweet and healthy, the association between 
Dutch and rpy is further strengthened. This is 
not the case with associations Rp, Rs, and Ra 





ee ee 

DUTCH Se 
~s 

~ 


— 





== ans 


~ ae ie Tpy 
HEALY eT 
By 


Fic. 1. THe CONDITIONING OF A Positive ATTITUDE. 
Tue Heaviness oF Line REPRESENTS STRENGTH 
oF ASSOCIATION 


since they occur only once and are followed by 
other associations which are inhibitory. The 
direct associations indicated in the figure 
between the name and the individual words 
would also in this way be inhibited. 
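
The argument above turns on a simple count: the evaluative response is shared by every word paired with the name, while each word-specific response is paired with it only once. The toy tally below illustrates that asymmetry and nothing more. It is not the authors' quantitative model, it ignores the inhibitory associations mentioned in the text, and the labels "evaluative+" and "R_..." are hypothetical.

```python
from collections import Counter

# Three of the pairings used as the example in the text.
pairings = [("Dutch", "pretty"), ("Dutch", "sweet"), ("Dutch", "healthy")]

strength = Counter()
for name, word in pairings:
    strength[(name, "evaluative+")] += 1   # component shared by all the words
    strength[(name, "R_" + word)] += 1     # component specific to this word

for link, count in sorted(strength.items()):
    print(link, count)
```

After three pairings the shared evaluative link has been paired three times and each word-specific link once; with 18 pairings, as in the experiments, the gap would be correspondingly wider.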

It was not thought that a rating response 
was conditioned in this procedure but rather 
an implicit attitudinal response which medi- 
ated the behavior of scoring the semantic 
differential scale. It is possible, with this con- 
ception, to interpret two studies by Razran 
(10, 11) which concern the conditioning of rat- 
ings. Razran found that ratings of ethnically 
labeled pictures of girls and sociopolitical slo- 
gans could be changed by showing these stimuli 
while Ss were consuming a free lunch and, in 
the case of the slogans, while the Ss were 
presented with unpleasant olfactory stimula- 
tion. The change in ratings could be thought 
to be due to the conditioning of an implicit 
evaluative response, an attitude, to the CSs by 
means of the lunch or the unpleasant odors. 
That is, part of the total response elicited 
by the food, for example, was conditioned to 
the pictures or slogans and became the 
mediation process which in turn elicited the 
positive rating. 

It should be stated that the results of the 
present study do not show directly that Ss’ 
behavior to the object (e.g., a person of Dutch 
nationality) has been changed. The results 
pertain to the Ss’ attitudinal response to the 
signs, the national names themselves. However, 
Kapustnik (3) has demonstrated that a re- 
sponse generalized to an object when the re- 
sponse had previously been conditioned to the 
verbal sign of the object. Osgood states, 


The aggressive reactions associated with Nazi and Jap 
on a verbal level certainly transferred to the social 
objects represented under appropriate conditions. 
Similarly, prejudicial behaviors established while read- 
ing about a member of a social class can transfer to the 
class as a whole... (7, p. 704). 


The results of this study have special rele- 
vance for an understanding of attitude forma- 
tion and change by means of verbal communi- 
cation. Using a conception of meaning as a 
mediating response, Mowrer (5) has suggested 
that a sentence is a conditioning device and 
that communication takes place when the 
meaning response which has been elicited by 
the predicate is conditioned to the subject of 
the sentence. The results of the present study
and the previous one of the present authors
(12) support Mowrer’s approach by
substantiating the basic theory that word meaning
will indeed condition to contiguously
presented verbal stimuli. In the present study,
the meaning component was evaluative, or 
attitudinal, and the CSs were socially signifi- 
cant verbal stimuli. The results suggest, there- 
fore, that attitude formation or change through 
communication takes place according to these 
principles of conditioning. As an example, the 
sentence, “Dutch people are honest,” would 
condition the positive attitude elicited by 
“honest” to “Dutch”—and presumably to any 
person called “Dutch.” If, in an individual’s 
history, many words eliciting a positive atti- 
tude were paired with “Dutch,” then a very 
positive attitude toward this nationality would 
arise. 

The reason for the group differences in each 
of the experiments is not clear. These differ- 
ences could have arisen because there were 
actual differences in the Ss composing each 
group, or in some condition of the procedure 
occurring to one of the groups. Nothing the
authors were aware of seems to indicate this
as the explanation, and in the previous experi- 
ments of the authors (12) there were no group 
differences. Since in a 2 × 2 Latin square the
interactions are entirely confounded with the 
main effects, the group differences could also 
have arisen as a result of the interaction of the 
other two main effects (i.e., direction of con- 
ditioning and names). 
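
The confounding described above can be checked with contrast codes. The sketch below assumes a layout in which one group had the first name conditioned positively and the second negatively, and the other group had the reverse; the +1/-1 coding is illustrative and is not taken from the article.

```python
# Contrast codes (+1 / -1) for the four cells of a 2 x 2 Latin square.
# Assumed layout: group +1 has name +1 conditioned positively (+1) and
# name -1 conditioned negatively (-1); group -1 has the reverse.
cells = [
    # (group, name, direction)
    (+1, +1, +1),
    (+1, -1, -1),
    (-1, +1, -1),
    (-1, -1, +1),
]

group_contrast = [g for g, _, _ in cells]
names_by_direction = [n * d for _, n, d in cells]

# The two contrasts are identical cell for cell, so the design cannot
# distinguish a groups effect from a names-by-direction interaction.
print(group_contrast == names_by_direction)
```

Under this coding the groups contrast and the names-by-direction interaction contrast coincide in every cell, which is what "entirely confounded" means here.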


SUMMARY 


Two experiments were conducted to test 
the hypothesis that attitude responses elicited 
by a word can be conditioned to a contigu- 
ously presented socially significant verbal 
stimulus. A name (e.g., Dutch) was presented 
18 times, each time paired with the auditory 
presentation of a different word. While these 


words were different, they all had an identical 
evaluative meaning component. In Experi- 
ment I, one national name was paired with 
positive evaluative meaning and another was 
paired with negative evaluative meaning. In 
Experiment II, familiar masculine names 
were used. In each experiment there was sig- 
nificant evidence that meaning responses had 
been conditioned to the names without Ss’ 
awareness. 


REFERENCES 


1. Cofer, C. N., & Foley, J. P. Mediated generalization
   and the interpretation of verbal behavior:
   I. Prolegomena. Psychol. Rev., 1942, 49, 513-540.

2. Doob, L. W. The behavior of attitudes. Psychol.
   Rev., 1947, 54, 135-156.

3. Kapustnik, O. P. The interrelation between direct
   conditioned stimuli and their verbal symbols.
   (Trans. from Russian title) Psychol. Abstr.,
   1934, 8, No. 153.

4. Lindquist, E. F. Design and analysis of experiments
   in psychology and education. Boston:
   Houghton Mifflin, 1953.

5. Mowrer, O. H. The psychologist looks at language.
   Amer. Psychologist, 1954, 9, 660-694.

6. Osgood, C. E. The nature and measurement of
   meaning. Psychol. Bull., 1952, 49, 197-237.

7. Osgood, C. E. Method and theory in experimental
   psychology. New York: Oxford Univer. Press,
   1953.

8. Osgood, C. E., & Suci, G. J. Factor analysis of
   meaning. J. exp. Psychol., 1955, 50, 325-338.

9. Osgood, C. E., & Tannenbaum, P. H. The
   principle of congruity in the prediction of
   attitude change. Psychol. Rev., 1955, 62, 42-55.

10. Razran, G. H. S. Conditioning away social bias
    by the luncheon technique. Psychol. Bull., 1938,
    35, 693.

11. Razran, G. H. S. Conditioned response changes in
    rating and appraising sociopolitical slogans.
    Psychol. Bull., 1940, 37, 481.

12. Staats, C. K., & Staats, A. W. Meaning established
    by classical conditioning. J. exp. Psychol.,
    1957, 54, 74-80.


Received June 12, 1957. 








CHANGES IN RELIGIOUS INTEREST: A RETEST AFTER 15 YEARS 


IRVING E. BENDER 
Dartmouth College 


ABUNDANT as are follow-up studies of
college students, all too few have been
concerned with a program of research 
during, as well as after, college; particularly 
is this the case in respect to research on motives 
and values. This report represents an attempt 
to capitalize on the opportunity for such re- 
search presented by the fact that in 1939-40 
an intensive psychological study (4) had been
made of 124 seniors at Dartmouth College 
by means of tests, ratings, interviews, and 
autobiographies. Although these men, one 
quarter of the senior class, were selected for a 
variety of reasons, the sample was found to be 
representative of the whole class in terms of 
scholastic aptitude and performance (4, p. 48). 
A renewed study of these same men was 
made possible in 1955-56 through a grant from 
the American Philosophical Society; 61 men 
were interviewed and 84 were retested on the 
Study of Values. Also, a questionnaire was 
mailed to the 118 surviving subjects of the 
original study, and, by tugging and pulling, a
final count of 112 forms was returned, a 95
per cent triumph.

A few general characteristics of this group 
may be mentioned. In 1956, they ranged in 
age from 36 to 42. Almost all the men are 
married; only four have never married. Five
men have been divorced, four of whom have 
remarried. Of the married group, all but six 
have children, with an average of 2.5 per 
family. However, twelve families have four 
children and two families have five. Weekly 
church attendance is reported by 38 men, but 
24 do not go at all. The remainder go some- 
times. Well over half the men report that their 
children attend Sunday school regularly. The 
group is predominantly Protestant. Eleven 
are Catholic and nine are Jewish. More than 
half of the men continued with graduate work 
after college and 47 have advanced degrees, 
chiefly in business and administration, but 
four have the Ph.D. degree. Of the 112 men, 
76 are in business activities, 12 are teachers, 9 
lawyers, 6 physicians, 4 writers, 1 chemist, 1 
architect, 1 engineer, 1 minister, and 1 retired. 
In one of the earliest studies of Dartmouth 


College, King (7) reported that the 64 men 
in the class of 1880 were in the following voca- 
tions: 17 lawyers, 14 teachers, 11 businessmen, 
7 physicians, 7 ministers, 4 engineers, 1 journal- 
ist, and 1 chemist (two men died soon after 
graduation). The present shift in vocational 
interest is apparent. 

One of the measures that proved particu- 
larly helpful in the original study of motiva- 
tion was the Allport-Vernon Study of Values, 
1931 edition. Respecting subjective values, 
Allport writes, they are “the core of the dy- 
namics of behavior, and play so large a part in 
unifying the personality” (2, p. 427). So a plan 
was arranged to use the same edition of the 
Study of Values, and 84 men were retested 
during the years of 1955 and 1956. It is to be 
recalled that these scores are relative; a high 
score in one or more values is counterbalanced 
by a low score in one or more of the remaining 
values. There are 180 points to be divided 
among six values; the theoretical mean of each 
is 30. 
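
Because the six scores must share a fixed total of 180 points, the group means in Table 1 should sum to roughly 180 at each testing, and the mean changes should roughly cancel. The check below is a small illustration using the Table 1 means; the variable names are hypothetical.

```python
# Mean scores copied from Table 1 (N = 84).
t40 = {"Theoretical": 28.30, "Economic": 31.51, "Aesthetic": 30.67,
       "Social": 30.88, "Political": 33.44, "Religious": 25.21}
t56 = {"Theoretical": 30.29, "Economic": 29.63, "Aesthetic": 25.81,
       "Social": 28.98, "Political": 32.80, "Religious": 32.37}

TOTAL_POINTS = 180                      # points divided among six values
theoretical_mean = TOTAL_POINTS / len(t40)

sum_t40 = sum(t40.values())
sum_t56 = sum(t56.values())
net_change = sum(t56[v] - t40[v] for v in t40)

print(theoretical_mean, round(sum_t40, 2), round(sum_t56, 2), round(net_change, 2))
```

Both sums fall within a fraction of a point of 180. A rise in one value, such as the religious, is therefore necessarily offset by declines elsewhere, which is why the scores are described as relative.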

RESULTS AND DISCUSSION 


Three main findings of this study are re- 
ported. First, a general survey of the shifts in 
the six values from 1940 to 1956; secondly, the 
correlations of religious scores and other 
variables; and thirdly, an analysis of the 
religious-scoring items of the test. 


Shifts in Value Scores 


Table 1 reports the mean scores and the t
values of the differences between the 1939 and 
1940 scores on the Study of Values and the 
1955 and 1956 scores; the former scores are 
indicated as T40 and the latter as T56 scores. 

The political value—interest in power and 
influence—remains the highest value of these 
respondents, only slightly lower after the 
lapse of 15 years. The most striking change 
is the highly significant increase in the religious 
value, which has now risen from its sub- 
basement position in 1940 to become a close 
runner-up to the political value in 1955-56. 
Higher religious scores in the retest occur for 
80 per cent of the group. Of the 118 men, only 







TABLE 1

COMPARISON OF ALLPORT-VERNON SCORES, 1939-40
vs. 1955-56

(N = 84)

                 T40 Scores        T56 Scores
Value           Mean     SD       Mean     SD        r        t

Theoretical     28.30   7.09     30.29   5.84     +.48     2.69**
Economic        31.51   9.03     29.63   9.18     +.53     1.92
Aesthetic       30.67   8.14     25.81   9.48     +.61     5.58**
Social          30.88   6.15     28.98   5.52     +.20     2.34*
Political       33.44   6.90     32.80   6.89     +.41      .77
Religious       25.21   8.28     32.37   8.93     +.49     7.43**

* p = .05 at t = 1.99.
** p = .01 at t = 2.64.
a Kelly (6) reports the following data for value change among
176 men over 20 years:

                  Time 1           Time 2
Value            M      SD        M      SD

Aesthetic       28.6    7.9      26.0    7.4
Religious       29.0    9.1      33.8    9.1


one is a minister. As expected, his highest 
scores are in the religious value (T40: 40; T56:
39). However, of the 84 men who were retested, 
22 men or 26 per cent score at least as high as 
the minister, while in 1940, only 4 per cent 
scored as high as the minister-to-be’s T40 score. 
The religious value is the highest score for 23 
of the retested group, 27 per cent of the re- 
spondents. In 1940, the religious value was 
highest for only 6 of these 84 men, 7 per cent. 

In all values except the social, T40 and T56 
scores are significantly correlated (Table 1). 
The social value has been frequently shown to 
fall short of reliability. The conclusions of the 
present study rest of course on the degree of 
reliability and validity that has been demon- 
strated for the Study of Values (5, 8, 9). 

In Table 1, differences between T40 and
T56 mean scores of the matched groups are
evaluated by t test. Religious, aesthetic, and
theoretical values have all changed significantly.
Since the most arresting change is in
the religious value, the question arises whether this
increment is reliably greater than the other
changes; particularly, is it greater than the
marked drop in the aesthetic value? Since the
difference between the change in the mean
religious scores and that in mean aesthetic
scores has a t value of 4.8 (p < .01), the shift
to religious interest is decisively predominant.
Kelly (6) reports a similar finding for 176 men 





after 20 years; the most significant shifts in 
the Allport-Vernon test were the increase in
the religious value and a decrease in the 
aesthetic value. (See footnote to Table 1.) 
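
A t value for matched groups can be approximated from the summary statistics in Table 1 alone, since the test-retest correlation fixes the variance of the difference scores. The sketch below assumes the usual formula for correlated means and treats the tabled SDs as sample standard deviations. The article does not spell out its computation, so the function and its name are illustrative rather than a reconstruction of the author's method.

```python
import math

def paired_t(mean1, sd1, mean2, sd2, r, n):
    """Approximate t for the difference between two correlated means."""
    var_diff = sd1 ** 2 + sd2 ** 2 - 2 * r * sd1 * sd2   # variance of differences
    se_diff = math.sqrt(var_diff / n)                    # standard error of mean difference
    return (mean2 - mean1) / se_diff

# Table 1 entries (N = 84).
t_religious = paired_t(25.21, 8.28, 32.37, 8.93, 0.49, 84)   # tabled t: 7.43
t_social = paired_t(30.88, 6.15, 28.98, 5.52, 0.20, 84)      # tabled |t|: 2.34

print(f"religious t = {t_religious:.2f}, social t = {t_social:.2f}")
```

The recomputed values land close to the tabled 7.43 and 2.34 (the sign for the social value is negative because the mean dropped). Exact agreement is not expected, because the printed means, SDs, and correlations are rounded.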

The aesthetic value, “the search for form 
and harmony,” has never been salient for 
these men; at best in 1940 it occupied only a 
middle position. Perhaps during college more 
than now, slightly greater opportunities pre- 
sented themselves to enjoy “each single 
impression ... for its own sake” (2), but at 
present this attitude is at low ebb.

The Study of Values is based on Spranger’s 
Lebensformen, where the religious man is said 
“to endow the world with meaning and 
value ...he finds something divine in every 
aspect of life” (1, p. 212-213). The religious 
value implies an immanent faith in a higher 
reality than that of everyday life, a reality 
concerned with the experience of unity. As 
we shall see, questions in the test that relate 
to this attitude have changed less than those 
which have to do with more orthodox views 
about the divine being. 


Correlates of the Religious Value 


Our second consideration deals with correla- 
tions of variables with religious value scores in 
1940 and 1956, reported in Table 2. The r of 
+.79 between 1956 scores and church attend- 
ance supports the validity of the religious 
score. 

During the investigation of motivation of 
these men in 1939-40, the men were rated on 
the following scale of Value-Energy: 


1A Inspired: Is fired with enduring zeal and is not to be 
discouraged or deflected from his course. 

1B Intensive: Has a firm faith in the importance of his 
values and is motivated to follow them out 
steadily and eagerly. 

2 Seeking: Is trying earnestly to distill values from his 
experiences which will serve to centralize his 
purposes. 

Anxious: Is motivated to compete by his desire to 
display and sometimes tries to do more than his 
potentialities warrant. 

Vague: Has no strong interests or values of his own 
but depends upon external motivation such as 
rigid requirements or fear of punishment. 

4A Drifting: Aimless in values and tending to derive his 
motivation from momentary stimulation. 

5 Abulic: Is unable to make up his mind about any 
values and almost totally lacking in motivation. 


In 1955-56, the men who were interviewed 
were rated again in Value-Energy. An r of 







TABLE 2

CORRELATIONS OF RELIGIOUS SCORES IN THE STUDY
OF VALUES AND OTHER VARIABLES

Variable                              T40 Scores | T56 Scores

Church attendance
Children’s Church attendance
Value-Energy rating 1955-56
Value-Energy rating 1940
Amer. psychol. exam.
No. of extra-curricular activities
Extroversion Scale A-B
Means of professorial ratings
Self-Development ratings 1956

* p < .05.
** p < .01.



+.31 is found between the ratings of Value- 
Energy of 1940 and 1955-56. This correlation 
is to be interpreted as not so much a measure 
of reliability as a reflection of some degree of 
consistency on the part of those men in fol- 
lowing out their purposes from college to their 
present situations. If these ratings are valid, 
Value-Energy tends to have an enduring 
quality. The 1940 ratings of Value-Energy are 
not significantly related to the religious scores. 
But the new 1956 ratings correlate at the .05 
level with the 1956 religious scores. Nine men 
are rated lower in Value-Energy in 1956 than
in 1940. The remaining 52 men compared with 
this small group have reliably higher scores in 
the religious value. 

In 1955 and 1956, after the men had been interviewed,
they were also rated on the variable
called Self-Development, the degree to which 
they were living out their purposes and poten- 
tialities. The scale for rating Self-Development 
grew out of a series of interview questions 
aimed at determining the respondent’s (a) goal 
attainment, whether or not he was getting 
from life what he wanted and also (b) self-actualization,
to what extent he was using his
capacities and inner resources as determined 
by the author’s knowledge of tests taken during 
his college career. Separate ratings on a five- 
step scale were made of each of these variables. 
Since an r of +.72 was found between the two
sets of ratings, they were combined and 
designated as ratings of Self-Development. 

These Self-Development ratings correlate at 
the .05 level with the 1956 religious scores. 
Since the author also rated the men in terms of 
how well he liked them, a favorable attitude 


may well have impinged upon the ratings of 
Value-Energy and those of Self-Development. 
However, the chance relationship between 
liking and religious scores tends to deny such 
an implication. The only other significant 
correlations concern the negative relationship 
between the 1940 ratings of extroversion and 
religious scores T40 and T56. 

Further light on the changes in religious 
values is cast by an examination of scores by 
different subgroups of respondents. In Table 
3, the results are reported for the men in the 
three divisions of general study, namely, the 
humanities, sciences, and social sciences. The 
increase of 9.2 points from 1940 to 1956 in 
the religious score of the 21 men who had 
been students of the humanities approaches 
significance at the .05 level. The increase of 6 
points from 1940 to 1956 scores for the 52 
social sciences men is, however, highly signif- 
icant. When the mean differences between 


TABLE 3
COMPARISON OF MEAN RELIGIOUS VALUE SCORES
(T40 vs. T56) FOR MAJOR DIVISIONS OF
STUDY AT COLLEGE

                         T40              T56
Major Divisions       Mean    SD       Mean    SD

1 Humanities          25.5    7.5      34.7   10.0
2 Science             25.4    9.3      30.2    9.7
3 Social science      24.5    8.0      30.5    5.7

* p < .05.
** p < .01.



TABLE 4

COMPARISON OF RELIGIOUS VALUE SCORES (T56)
OF PRESENT OCCUPATIONAL GROUPS

1 Executives in large corporations
2 Small self-run business
3 Family business
4 Salesmen
5 Salaried professionals
6 Independent profession

t Values Between the Groups
   2       3

  2.5*    .61


TABLE 5
COMPARISON OF MEAN RELIGIOUS VALUE SCORES
(T40 vs. T56) FOR RELIGIOUS DENOMINATIONS

                     T40 Scores           T56 Scores
Denomination       Mean    N    SD      Mean    N    SD

Congregational     24.0   25   7.7      29.4   20
Episcopal          23.4   22   7.9      30.3   17
Presbyterian       24.6   16   8.2      34.0   12

* p < .05.
** p < .01.


the divisions are studied, none is significant,
either for 1940 or 1956.

Robert Gutman of the Sociology Department
of Dartmouth College set up occupational
groupings of the men for whom 1956
value scores were available. The mean differences
are shown in Table 4. The outstanding
differences occur for the self-initiated small
business group who show significantly higher
T56 scores in religious values than do the
large corporation executives or the family
business group.

Table 5 gives findings according to the
former student’s religious denomination as of
the time he entered college. Differences in
mean T40 and T56 scores are reported for the
three denominations represented by more than
ten cases, and the differences are found to be
reliable. No interdenominational differences of
the means for either 1940 or 1956 are reliable.

A few additional facts deserve mention. A
comparison of 38 men who report weekly
church attendance and the 24 who do not
attend at all indicates that the latter group
report reliably lower goal achievement and
less civic participation, although religious
value as such did not turn out to be reliably
correlated with the latter variables. Only 17
men of the 84 retested decreased their religious
score. This deviant group is reliably different
(p < .05) from the larger group in their 1956
religious scores. They also report lower goal
achievement, less participation in civic affairs,
and less frequent church attendance. Of the
six men from whom it proved impossible to
obtain returns, the mean religious score was
20 in 1940 compared with the total 1940 mean
of 25.


Item Changes

We have seen a marked increase in mean
religious value score for Dartmouth men over
15 years. In what specific ways did these men
change their religious values? Was the change
based on a more mature philosophy of life or
was it in a more theological direction?


TABLE 6

RELIGIOUS ITEMS SHOWING SIGNIFICANT CHANGE
FROM T40 TO T56

Item No. in The Study of Values

Part I

6. Which of these character traits do you
   consider the most desirable: (a) high
   ideals and reverence; (b) unselfishness and
   sympathy?

   All the evidence that has been impartially
   accumulated goes to show that the
   universe has evolved to its present state
   in accordance with mechanistic principles,
   so that there is no need to assume a
   first, cosmic purpose, or God behind it.
   (a) Yes; (b) No.

   The aim of the churches at the present
   time should be: (a) to bring out altruistic
   and charitable tendencies, and to urge
   people to think more of the good of
   others; (b) to convey spiritual worship,
   and a sense of communion with the
   highest.

   Taking the Bible as a whole, one should
   regard it from the point of view of its
   beautiful mythology and literary style
   rather than as a spiritual revelation.
   (a) Yes; (b) No.

Part II

2. In your opinion, can a man who works in
   business for his living all the week best
   spend Sunday in:
   a. trying to educate himself by reading
      serious books
   b. trying to win at golf or racing
   c. going to an orchestral concert
   d. hearing a really good sermon

   If you lived in a small town and had more
   than enough income for your needs,
   would you prefer to:
   a. apply it productively to industrial
      development
   b. help to endow the church to which you
      belong
   c. give it to a university for the development
      of scientific research
   d. devote it to hospitals

   Assuming that you are a man with the
   necessary ability, and that the salary for
   each of the following occupations is the
   same, would you prefer to be a:
   a. mathematician
   b. sales manager
   c. clergyman
   d. politician

   Should one guide one’s conduct according
   to, or develop one’s chief loyalties
   toward:
   a. one’s religious faith
   b. ideals of beauty
   c. one’s business organizations and
      associates
   d. society as a whole

   If you should marry (or are married) do
   you prefer a wife who:
   a. can achieve social prestige, commanding
      admiration from others
   b. likes to stay at home and keep
      house
   c. is fundamentally spiritual in her
      attitude toward life
   d. is gifted along artistic lines

Mean Item Score
T40    T56
38     1.3


For all but three of the 20 religious items in
the Allport-Vernon scale, change is in the 
direction of religious interest; significantly 
so at the .01 level in the case of nine of the 
items (Table 6). A study of the nature of the 
significant changes points up the quality of 
the upsurge in the religious value over the 
period of 15 years. The turn in the religious 
direction seems more theological than philo- 
sophical. 

The format of the Allport-Vernon instrument
is such that each value is
compared twice with each of the other values.
A study of such comparisons shows that there 
are reliable differences in favor of religious 
interest in both instances when the religious 
value is matched with the aesthetic and with 
the social values and in one instance only 
when matched with theoretical, economic, and
political values. 

Kelly (6) has raised the question of whether 
the increase in religious value was a function of 
the person’s maturity or a function of the 
times in which we live. To study this problem, 
the same test form was given to 66 present 
undergraduates, mostly seniors. On compar- 
ison of present undergraduates to the seniors 
of 1940 (Table 7), a reliably higher religious 
mean score is found for present undergraduates 
and a reliably lower mean economic score. 
Comparison of the mean scores of present 
undergraduates with the mean scores attained 
by the students of 1940 as retested in 1956, 
however, reveals a remarkable similarity of 
values. These data would suggest that the 


TABLE 7
COMPARISON OF PRESENT AND FORMER DARTMOUTH
STUDENTS WITH RESPECT TO ALLPORT-VERNON
SCORES

                 Former Students (N = 84)
                 Mean a      Mean a
                  T40         T56          Comparisons
                  (1)         (2)

Theoretical      28.30
Economic         31.51
Aesthetic        30.67
Social           30.88
Political        33.44
Religious        25.21

** = .01 at 2.00.
* = .05 at 1.97.
a See Table 1 for SDs.



temper of the times in which we live influences 
the religious value more than does the maturity 
of the men. Apparently, the same need for 
religious interest exists now among the young
as among the older. 

From interviews with the men, it becomes 
clear that religion in one form or another has 
usually become important to them. Further- 
more, in a number of cases it has given fuller 
meaning and purpose to life. The interviews 
confirm Allport’s statement: “While religion 
certainly fortifies the individual against the 
inroads of anxiety, doubt, and despair, it 
also provides the forward intention that 
enables him at each stage of his becoming to 
relate himself meaningfully to the totality of 
being” (3, p. 96). 

Interesting, indeed, are the speculative ques- 
tions that this change in religious interest 
suggests. Is the resurgence of religion a part 
of the widespread and expanding need to 
belong, the need to conform? Or is the change 
motivated by fear, not without an undertone 
of hope? Do circumstances compel one to 
despair of reason and invoke the ancient 
answer to trust in God? 

From the interviews with more than half 
of the men either in their homes or in their 
offices, it was apparent that they are by and 
large successful socially, vocationally, and 
financially. Their families are loved and well 
provided for. However, what was patently 
sought was a perspective for something pur- 
posive beyond the material comforts of their 
existence. Certain vague anxieties were ex- 
pressed about the routine and the literalness 
of their lives. Increasingly, the return to 
religion seems to quell their fears and to 
promise hope. 


SUMMARY 


College students who were originally studied 
in 1940 were interviewed and retested in 1955 
and 1956. The results of changes in the religious 
value scores on the Allport-Vernon Study of 
Values are reported here. A significant increase 
in the religious value scores is found after the 
15-year interval. 

These religious scores correlate highly with 
church attendance and to a lesser extent with 
present ratings of Value-Energy and Self- 
Development. Dividing the respondents into 
subgroups, we find the most marked increase 
in religious value scores on the part of (a) 
respondents who were students in the social 
sciences, (b) those who are now operating a
small self-run business, and (c) those who 
class themselves as Presbyterians. Among the 
religious scoring items, the theological items 
have increased more than the philosophical 
ones. The religious value scores of present
undergraduates are reliably higher than those of
the men who took the test in 1940, and are
remarkably similar to those of the respondents who
took the test in 1956.
In everyday life it sometimes happens that
the members of a group are faced with a
contingency such that for any one of them
to try to lead is to invite personal rejection by
the rest. This situation arises when, as the
members see it, what must intervene between
them and task success is bound to be dangerous,
obnoxious, exhausting, or generally
unpleasant. Whoever proposes any task-relevant
course of action is sure to be unpopular,
and yet the outcome of the project depends
upon the initiation of group action. Under
these restrictive circumstances, who can afford
to attempt to lead his group? What is the
individual like who is willing and able to take
this social risk? These are the central questions
that provoked the experiment to be described.
In addition, supplementary data collected
during the study provide us with information
about productivity and morale under these
special conditions. Ever since the pioneer
experiment of Lewin, Lippitt, and White (11,
12) social “meteorologists” have been interested
in the effects of leader behavior upon
group “climate” and, also, in the effects of
such atmospheric conditions upon group
productivity. Our collateral analysis tells us
something about what happens to group
productivity and morale when the tables are
turned, and group climate varies as a function


1This research was conducted under Contract 
N6ori-17 T.O. III (NR 171-123) between the Ohio 
State University Research Foundation and the U. S. 
Office of Naval Research. The opinions expressed are 
those of the authors alone. They wish to thank the 
J. R. Hanson, who functioned as one of the trained
“stooges”; and H. B. Pepinsky, who gave constructive 
criticism in the preparation of the manuscript. Sample 
copies of the forms used in the study are appended to 
the original technical report to ONR (14), which may 
be obtained on loan from the Gifts and Exchange De- 
partment of The Ohio State University Library. 


of the behavior of other members toward those
who try to assume leadership. 

This experiment was designed primarily to 
test the hypothesis that when attempts to lead 
are repeatedly met by personal rejection, group 
members characterized by a relatively high 
need for achievement and a low need for 
affiliation will make such attempts more 
frequently than other members who have low 
needs for achievement, but relatively high 
affiliation needs. This prediction was based 
upon a provisional theory of leadership formu- 
lated by Hemphill (6), and the study is one of 
a series of four small-group experiments con- 
ducted to test some of the propositions stated 
in that theory. These studies (8, 9, 14, 18) all 
were concerned with the identification of 
motivational variables antecedent to attempts 
to lead, without regard for whether such 
attempts actually were followed by the group 
or did in fact lead to task success. In effect, the
general question was, “What makes Johnny 
run,” rather than, “What makes Johnny get 
elected.” Throughout this exploratory research 
the major dependent variable was the observed 
frequency of “attempted leadership acts,” 
defined by Hemphill (6) as attempts to 
“initiate structure-in-interaction” through 
making task-relevant proposals or suggestions 
(cf. Bass et al., 3). 

In this particular experiment, the independ- 
ent variables were (a) the inferred needs of the 
subjects, as measured prior to the experiment, 
and (b) the special conditions to which they
were then exposed. The conceptions of needs 
for achievement and affiliation used here are 
similar to but not identical with those proposed
by McClelland et al. (13), and by Shipley
and Veroff (19). In Hemphill’s view— 

Need Achievement (n Ach) is inferred when 
observed behavior is interpreted as a consistent 
attempt by the individual to experience success 
in competition with some standard of excel- 
lence. 
Need Affiliation (n Aff) is inferred when observed 
behavior is interpreted as a consistent 
attempt by the individual to establish, main- 
tain, or recover friendly, warm, or loving 
relationships with other persons. 

Hemphill introduces an emphasis upon 
consistency in behavior in different situations 
and over a period of time and regards such 
consistency as an index of the strength of the 
respective needs. In contrast, McClelland’s 
theoretical position has led to the development 
of measurement procedures that are critically 
dependent upon arousal in a specific situation. 

The situational arrangements in the present 
study were designed to establish one or the 
other of two conditions, but otherwise to main- 
tain equal opportunities for task success: 

The rejection condition (R) was intended to 
produce in the Ss the expectation that attempt- 
ing to lead would result in their personal rejec- 
tion by other group members. (Under this 
condition, attempting to lead was assumed to 
have satisfying consequences for need Achieve- 
ment, but dissatisfying consequences for need 
Affiliation.) 

The acceptance condition (A) was intended 
to produce in the Ss the expectation that 
attempting to lead would result in their per- 
sonal acceptance by other group members. 
(Under this condition, attempting to lead was 
assumed to have satisfying consequences for 
both need Achievement and need Affiliation.) 

Strictly interpreted, the stated hypothesis 
refers to behavior to be observed under condi- 
tion R only. But by adding condition A to the 
design, we could extend that information by 
relating it to what happened in a situation in 
which differences in needs were not expected 
to have a significant effect upon attempts to 
lead. 


METHOD 


Selection of Subjects 


By means of multiple screening procedures, 48 
white male Ss were selected from the students enrolled 
in the University introductory psychology course and 
assigned to 24 four-man groups, each comprised of two 
Ss and two “stooges.” The major criterion for selection 
of the Ss was their classification in respect to n Ach and 
n Aff. This classification was accomplished in two stages: 
first, potential Ss were prescreened according to their 
scores on a questionnaire, and, second, final selection 
was made in terms of independent ratings of their 
needs, based upon individual, structured interviews. 

The screening questionnaire consisted of two parts, 
a section that elicited general information and a series 


of items designed to measure the two needs. The first 
set of items permitted elimination of those individuals 
who did not meet criteria established to control the 
extraneous effects of status-related physical char- 
acteristics (e.g., age, physical size, and race). Items 
contained in the second section of the questionnaire 
were developed from an original pool of more than 400 
statements, which the individual members of the re- 
search staff devised as descriptive of behavior con- 
sistent with either high or low n Ach or n Aff. A pre- 
liminary 30-item form (15 items intended to measure 
each need) then was assembled from those items that at 
least five of six independent judges agreed were clearly 
relevant to one or the other of the given need defi- 
nitions. 

On the basis of a trial administration to 161 men en- 
rolled in the introductory psychology course, this sec- 
tion of the questionnaire was revised in order to elimi- 
nate nondiscriminating or apparently ambiguous items. 
In its final form the questionnaire contained 15 items, 
seven designed to yield a score for n Ach, and eight 
designed to yield a score for n Aff. The two scores are 
empirically independent; a correlation computed between 
them for a sample of 335 men is -.02. From a 
scatter plot of the distributions of the two scores, 86 
individuals whose scores fell in the high-low quadrant— 
and who also met the physical criteria—were selected 
for interviewing. 

Individual interviews, which averaged about 15 
minutes in length, were conducted with the potential 
Ss by one or the other of two trained interviewers. The 
interviewer had no knowledge of the individual’s scores 
on the screening questionnaire, beyond the fact that his 
scores were extreme enough to qualify him for the inter- 
view. The specific questions asked (modifying a sched- 
ule used by Rafferty [15]) focused upon activities and 
interests (vocational, academic, social) that pre- 
sumably would provide some basis for judgments as to 
the consistency with which, in various situations and 
over an extended time period, an individual had en- 
gaged in behavior symptomatic of n Ach or n Aff. Im- 
mediately after his interview, the interviewer rated 
each person on the two needs on separate nine-point 
scales. For any individual to be eligible to serve as an 
S, it was stipulated (a) that there should be a dis- 
crepancy of at least two scale intervals between his 
ratings on the two needs, and (b) that the lower of the 
two ratings should not exceed the fifth or midpoint on 
the scale. If, in the interviewer’s judgment, an in- 
dividual had met all the specified selection criteria, he 
was assigned to a particular experimental group, each 
containing one S who was classified as relatively high 
on n Ach and low on n Aff and one classed as high n Aff, 
low n Ach. Final selection and classification were based 
solely on the interview. 
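
The two interview criteria stated above can be sketched as a small rule. This is only an illustration, assuming the nine-point scales are coded 1 to 9 with 5 as the midpoint; the function name and return labels are hypothetical and do not come from the original report.

```python
def classify_subject(n_ach: int, n_aff: int, midpoint: int = 5, min_gap: int = 2):
    """Apply the two stated interview criteria to a pair of need ratings.

    Returns "high_ach", "high_aff", or None when the ratings do not qualify.
    Assumes ratings are integers on a 1-9 scale (an assumption about coding).
    """
    for rating in (n_ach, n_aff):
        if not 1 <= rating <= 9:
            raise ValueError("ratings are expected on a nine-point scale (1-9)")
    # (a) discrepancy of at least two scale intervals between the two ratings
    if abs(n_ach - n_aff) < min_gap:
        return None
    # (b) the lower of the two ratings must not exceed the scale midpoint
    if min(n_ach, n_aff) > midpoint:
        return None
    return "high_ach" if n_ach > n_aff else "high_aff"
```

For example, ratings of 8 (n Ach) and 3 (n Aff) would qualify as relatively high n Ach, whereas ratings of 8 and 6 would not qualify, because the lower rating exceeds the midpoint.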


Task 


All groups worked on the same experimental task, 
which was designed to maximize the necessity for 
planning and coordinating group action, and, hence, to 
elicit a relatively large number of attempts to lead. The 
basic idea elaborated upon in the development of the 
task was suggested by the work of Guetzkow and his 
associates (5) with hypothetical business corporations. 
This task, the Manufacturing Problem, required the 
group members to organize as a toy manufacturing 
concern and to operate their business for maximum 
profit. 

The physical arrangements for the task involved the 
use of four tables placed around the laboratory: (a) the 
“Supplier’s” table, where a large quantity and variety of 
Tinkertoy parts were available in appropriately labeled 
boxes; (b) a display table upon which five complete 
Tinkertoy models were arranged (a “top,” a “man,” 
an “airplane,” a “wagon,” and a “ladder”); (c) the 
“Shop,” a large table upon which the “products” 
were to be assembled, and upon which were arranged 
pads, pencils and order forms; and (d) a “Buyer’s” 
table at which finished toys were to be sold. 

At the outset, each group was given three dollars in 
cash with which to “set itself up in business” and was 
told that the members would be permitted to keep 
equal shares of whatever profit they made during the 
experiment. Itemized lists of supply costs and selling 
prices were distributed, and it was pointed out that 
both costs and prices would fluctuate every five min- 
utes throughout the two 20-minute work periods. In 
order to buy parts, the group had to fill out and submit 
to the supplier itemized order forms, each signed by all 
four team members and accompanied by sufficient 
money to cover the particular order. Each work session 
was preceded by a planning period (the first, 10, and the 
second, five minutes long); the first work session and the 
second planning period were separated by a 10-minute 
break. After the standard instructions had been read by 
the experimenter, the organization and operation of the 
business were left completely in the hands of the group. 
During the special planning periods no actual produc- 
tion was permitted, but discussion and revision of 
plans could be continued during the work sessions. 


Experimental Design and Procedure 


During their work on the experimental task half of 
the 24 groups were subjected to condition R and the 
other half to condition A, the conditions being alternated 
in RAAR order. In order to establish the conditions, 
the same pair of accomplices was introduced into 
each experimental group to play preassigned roles. 
Use was also made of the immediate feedback to the 
two experimental Ss in each group of the prearranged 
“results” of a sociometric questionnaire administered 
twice during the experiment. Two trained observers, 
who observed the group through a one-way mirror, 
tallied the frequency of the leadership acts attempted 
during the experimental sessions by each S. The observers 
were not informed about the need classification 
of the Ss, and the observational procedure was specified 
by rules developed in conformance with the theoretical 
definition of attempted leadership. The following 
procedural sequence comprised the complete experimental 
period: (a) a 15-minute preliminary or 
“warm-up” session, (b) the first administration of the 
sociometric questionnaire, (c) the first session of the 
experiment proper, (d) the second administration of the 
sociometric questionnaire, and (e) the second and final 
experimental session. 

The preliminary session, during which the group 
worked on a simple construction task, was intended 
primarily to provide a plausible basis for the first ad- 
ministration of the sociometric questionnaire. In this 
period, the stooges played relatively passive roles, but 
voluntarily undertook enough of the work of the group 


to prevent either one of the Ss from establishing him- 
self as the dominant leader. The sociometric question- 
naire consisted of seven items of the “guess-who” 
type, which elicited choices based upon criteria both of 
liking for them as individuals and of preference for them 
as leaders. After a brief interval during which the re- 
sults supposedly were actually tallied, identical reports 
were given privately to each S, and he was cautioned 
not to discuss his ratings with the other group members. 
The intensity of reported rejection or acceptance in- 
creased from the first to the second report, and ac- 
ceptance and rejection were varied to the same degree 
under both conditions. The feedback was concluded 
with a suggestion by the experimenter of the appro- 
priate inference to be drawn by the S, i.e., that he was 
being accepted (or rejected) because of the suggestions 
he was making to the group. 

Under Condition R, both stooges were instructed to 
show interest in and willingness to work on the task as 
such, but to indicate personal rejection of either S 
whenever he made an attempt to lead. This role was 
reversed under Condition A; whenever an S attempted 
to lead, the stooges indicated personal acceptance and 
approval of the initiator himself. In both instances, the 
stooges were to comply with all direct requests or 
orders and were themselves to make no attempts to 
lead. 

As soon as the experiment was over, the four group 
members (including the stooges, who simulated com- 
pliance) filled out a brief Postsession Questionnaire. 
The Ss also completed the Group Dimensions Descrip- 
tion Questionnaire (GDDQ) (Hemphill, 7) which yields 
scores on 13 dimensions, providing a “profile of an 
individual’s orientation (perception and attitudes) 
toward a group.” Finally, the research supervisor ex- 
plained the experimental procedures to the Ss and re- 
assured them about the adequacy of their performance. 


RESULTS 
Checks on Procedures 


Independent checks upon the experimental 
procedure provide information about (a) the 
reliability of the interview as a selection criterion, 
(b) the manipulation of the experimental 
conditions, and (c) the reliability of the meas- 
ure of the dependent variable. 

To obtain an estimate of interviewer agree- 
ment, one interviewer made independent need 
ratings based upon tape recordings of 32 inter- 
views conducted by the other interviewer. 
Since the two raters were responding to com- 
parable but not equivalent stimuli, the 
relationship between the two sets of ratings 
should be regarded as an attenuated index of 
interviewer agreement. Correlations of .67 
between their ratings of n Aff and .55 between 
their ratings of n Ach indicate a significant and 
positive, but not high degree of association. 
These correlations are, however, comparable 
to those obtained in studies of agreement in 
clinical judgments (e.g., Hunt, Arnhoff, and 
Cotton [10] and Ash [2]). (Only two of these 32 
individuals actually would have been placed in 
opposite categories by the two interviewers; as 
it happened, one of these Ss was run under A 
and the other under R.) 

Two months after the completion of the 
experiment and more than four months after 
the initial interview ratings had been made, a 
third interviewer held similarly structured 
interviews with 25 of the 48 former Ss and, 
without knowledge of the previous ratings, 
rated their needs at that time. Correlations 
between the later and the original ratings are 
.47 for ratings of n Ach and .42 for ratings of 
n Aff, both of which are significant at the .05 
level. These figures may be interpreted either 
as indices of the “test-retest reliability” of the 
interview measure or as giving some empirical 
support to the conception of needs as being 
relatively stable over a period of time. 

The two sets of sociometric ratings of their 
fellow group members by the 48 Ss and their 
responses to the Postsession Questionnaire 
provide us with means to check upon the 
manipulation of the experimental conditions. 
These data (reported in Pepinsky, Hemphill, & 
Shevitz [14]) support the assumption that the 
Ss’ phenomenal view of the situation did 
correspond to the experimental intent. Sepa- 
rate chi-square tests by item show that in 
general under Condition A favorable socio- 
metric choices were randomly distributed 
between the stooges and the Ss; under Condi- 
tion R, however, the stooges received a 
disproportionate share of negative choices. 

The Postsession Questionnaire contained 
six items designed to test whether the Ss 
felt relatively more personal acceptance under 
A and more rejection under R (e.g., “members 
seemed, a, to welcome, or, b, to resent my 
suggestions”). Chi-square tests by item of the 
association between conditions and the re- 
sponse categories are significant (at the .01 
level or beyond in five instances; at the .05 
level for the remaining item), and the trends 
are in the predicted direction. For each of two 
additional items, designed to determine 
whether the Ss believed the falsified reports 
fed back to them, the responses show that 
nearly all Ss under both conditions did be- 
lieve the ratings reported to them: only three 
of the 48 Ss thought that “the people in the 
group were not to be trusted”; four Ss “felt 
that the ratings were completely false.” 
Regardless of the Ss’ need classifications, their 
responses to the Postsession Questionnaire 
corroborate the effectiveness of the experimental 
strategy. 


TABLE 1 
ANALYSIS OF VARIANCE OF ATTEMPTED LEADERSHIP SCORES 

Source of Variation               df   Variance      F1        F2
Between Conditions                 1   130.67        76.42**   21.92**
Between Needs                      1   0.38          -         -
Between Sessions                   1   5.04          2.95      -
Between Blocks                    11   4.86          2.84*     -
Conditions X Needs                 1   1.49          -         -
Conditions X Sessions              1   0.67          -         -
Conditions X Blocks               11   [illegible]   3.43*     -
Needs X Sessions                   1   0.37          -         -
Needs X Blocks                    11   2.60          1.52      -
Sessions X Blocks                 11   0.68          -         -
Conditions X Needs X Sessions      1   4.17          2.44      -
Conditions X Needs X Blocks       11   5.96          3.49*     -
Conditions X Blocks X Sessions    11   1.17          -         -
Needs X Sessions X Blocks         11   0.60          -         -
Residual                          11   1.71          -         -
Total                             95

* Significant at .05 level. 
** Significant at .01 level. 

Separate intraclass correlations computed as 
checks upon interobserver agreement were .96 
for the pairs of tallies of leadership attempts 
made during Session 1, .92 for Session 2, and 
.96 for the total tallies made during both 
sessions. Thus, these tallies provide a highly 
reliable measure of the dependent variable. 


Test of Hypotheses 


The data of major interest are the numbers 
of leadership acts attempted by the two classes 
of Ss under Conditions A and R. Each measure 
is the sum of the two observers’ tallies of the 
leadership acts attempted by each subject 
during one session. Since inspection revealed 
the presence of marked positive skewness, the 
raw data were transformed in order to obtain 
a more normal distribution and to stabilize 
the variance within treatments. This trans- 
formation was achieved by extracting the 
square root of the sum of the attempted 
leadership acts recorded by the two observers 
for each S during each session. The greatest 
integers smaller than these square roots are 
used as the individuals’ “attempted leadership 
scores.” 
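
The transformation just described can be sketched in a few lines. The sketch assumes that "the greatest integers smaller than these square roots" means the usual integer part (floor) of the square root; if a strictly smaller integer was intended for perfect squares, the result would differ by one in those cases. The function and argument names are hypothetical.

```python
import math


def attempted_leadership_score(tally_observer_1: int, tally_observer_2: int) -> int:
    """Sum the two observers' tallies for one S in one session,
    then take the integer part of the square root of that sum.
    Assumption: "greatest integer smaller than the square root" is read as floor.
    """
    total = tally_observer_1 + tally_observer_2
    return math.isqrt(total)  # floor of the square root; rejects negative totals
```

Under this reading, tallies of 10 and 7 from the two observers give a sum of 17 and a score of 4.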

Application of the analysis of variance 
permits a detailed examination of the effects 
upon attempts to lead, not only of needs, but 
also of the other independent variables and 
their interactions. The results of this analysis 
are presented in Table 1. The F1 column of 
the table reveals that the most conspicuous 
findings with reference to the major hypotheses 
are the failure of Needs and of the Conditions 
X Needs interaction to attain significance. 
It becomes evident, rather, that variation in 
the number of attempts to lead is definitely 
associated with the rejection-acceptance con- 
ditions. The effect of variation in the control 
of the conditions throughout the experiment, 
identified as Blocks (composed of two groups 
each), is significant at the .05 level. The 
interactions of Conditions X Blocks and of 
Conditions X Needs X Blocks are significant 
at the same level. The F2 column shows the 
results obtained when the significant lower 
order effects containing these components are 
retested against the Conditions X Needs X 
Blocks interaction. The effect of Conditions 
remains significant at the .01 level, whereas 
now the Blocks and Conditions X Blocks 
values do not meet the .05 criterion. We may 
conclude, therefore, from this more stringent 
test that the difference attributable to Condi- 
tions is not simply a product of variations 
between Blocks; individual differences in 
attempts to lead are dependent upon the 
special conditions of acceptance or rejection 
under which the Ss worked on the task. 
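
The two sets of F ratios can be checked from the variance estimates printed in Table 1. The sketch below assumes that the first column of F values uses the Residual variance as the error term (an inference from the arithmetic, not a statement in the text) and that the second column uses the Conditions X Needs X Blocks variance, as described above. Only entries that are legible in the table are included; the variable names are hypothetical.

```python
# Variance estimates as printed in Table 1 (legible entries only).
MEAN_SQUARES = {
    "Conditions": 130.67,
    "Sessions": 5.04,
    "Blocks": 4.86,
    "Needs X Blocks": 2.60,
    "Conditions X Needs X Sessions": 4.17,
    "Conditions X Needs X Blocks": 5.96,
}
RESIDUAL = 1.71  # assumed error term for the first F column


def f_ratio(effect_variance: float, error_variance: float) -> float:
    """Ratio of an effect's variance estimate to the chosen error variance."""
    return effect_variance / error_variance


# First column: each effect tested against the Residual variance.
f1 = {name: round(f_ratio(ms, RESIDUAL), 2) for name, ms in MEAN_SQUARES.items()}

# Second column: retest against the Conditions X Needs X Blocks interaction.
stricter_error = MEAN_SQUARES["Conditions X Needs X Blocks"]
f2_conditions = round(f_ratio(MEAN_SQUARES["Conditions"], stricter_error), 2)
f2_blocks = round(f_ratio(MEAN_SQUARES["Blocks"], stricter_error), 2)
```

With these inputs the ratio for Conditions is 76.42 against the Residual and 21.92 against the interaction, matching the tabled values, while the retested ratio for Blocks falls below 1.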

A point biserial correlation coefficient was 
computed to obtain an estimate of the extent 
of the association between the experimental 
conditions and attempted leadership scores. 
The coefficient is .61, which, as an under- 
estimate of the true relationship, indicates 
that more than 36 per cent of the variance in 
attempts to lead may be accounted for by the 
conditions of acceptance or rejection to which 
the individual group members were subjected. 
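
For readers who wish to see how such a coefficient is formed, a minimal sketch of the point biserial correlation follows, together with the squared value of the reported coefficient. The individual scores were not published, so the small data set used in any trial of the function must be hypothetical; the function name is likewise an illustration only.

```python
import math


def point_biserial(group0, group1):
    """Point biserial r between a two-valued variable (group membership)
    and a continuous score, using the population standard deviation."""
    n0, n1 = len(group0), len(group1)
    n = n0 + n1
    mean0 = sum(group0) / n0
    mean1 = sum(group1) / n1
    scores = list(group0) + list(group1)
    grand_mean = sum(scores) / n
    sd = math.sqrt(sum((x - grand_mean) ** 2 for x in scores) / n)
    return (mean1 - mean0) / sd * math.sqrt(n0 * n1 / n ** 2)


# Proportion of variance associated with the reported coefficient of .61.
variance_accounted_for = round(0.61 ** 2, 4)  # 0.3721
```

The squared coefficient, .3721, is the basis for the statement that more than 36 per cent of the variance may be accounted for by the conditions.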


Group Productivity and Morale 


Our records now permit us to examine the 
relationships between group productivity as 
measured by task success—i.e., amount of 
money earned—and four other variables: 
(a) attempted leadership scores; (b) experi- 
mental Conditions A and R; (c) judged quality 
of group decisions, where each requisition for 
model components countersigned by all group 
members is regarded as one such “decision”; 
and (d) group member “morale” as measured 
by individual stanine scores on two dimen- 


sions of the GDDQ—Hedonic Tone and 
Viscidity. In brief, Hedonic Tone is “the 
degree to which group membership is ac- 
companied by a general feeling of pleasantness 
or unpleasantness”; Viscidity is “the degree to 
which members of the group function as a 
unit” without personal conflict and dissension 
(Hemphill, 7). 

What we find first is that there is a pro- 
nounced positive relationship between the 
groups’ profits and attempts to lead under 
Condition R, but that these variables are 
independent under Condition A. Kendall’s tau 
was used to evaluate the correspondence 
between the group rankings arranged according 
to (a) amount of money earned and (b) 
order of magnitude of the summed attempted 
leadership scores of the two Ss in each group. 
Under Condition R, tau is .84 (P less than 
.001), while under A tau is .00. Next, we find 
that although the average profit under R is 
$1.80 as compared with $3.37 under A, the 
apparent difference is not reliable (t = 1.71), 
due to wide variability present under both 
conditions. 
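
The rank comparison reported above can be illustrated with a short sketch of Kendall's tau. The text does not say how ties were handled, so the sketch assumes the simple form (tau-a), in which tied pairs count as neither concordant nor discordant. The group rankings themselves were not published; any rankings passed to the function are therefore hypothetical.

```python
from itertools import combinations


def kendall_tau_a(rank_x, rank_y):
    """Kendall's tau-a: (concordant - discordant) / total number of pairs.
    Assumption: no correction for ties is applied."""
    if len(rank_x) != len(rank_y):
        raise ValueError("both rankings must cover the same groups")
    n = len(rank_x)
    concordant = discordant = 0
    for i, j in combinations(range(n), 2):
        product = (rank_x[i] - rank_x[j]) * (rank_y[i] - rank_y[j])
        if product > 0:
            concordant += 1
        elif product < 0:
            discordant += 1
    return (concordant - discordant) / (n * (n - 1) // 2)
```

Two identical rankings give a value of 1.0 and two exactly reversed rankings give -1.0; a value of .84, as found under Condition R, indicates that nearly all pairs of groups were ordered alike on profit and on attempted leadership.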

The number of leadership attempts and the 
amount of group profit will be positively 
correlated only when suggestions are made 
and followed that consistently tend to be 
economically sound proposals. And yet, under 
Condition A, where no such relationship exists, 
net profit is at least as high as it is under Condi- 
tion R, where money earned is directly as- 
sociated with attempts to lead. These com- 
parisons suggested that there might be a 
qualitative difference in the character of the 
groups’ decisions under the two conditions, 
and so an analysis was made of the order 
forms submitted by the 24 groups. It was 
assumed that since each of these requisitions 
had to be signed by all the members of the 
group, each one represented group consensus 
about a plan of action, and they could properly 
be regarded as records of group decisions. Two 
judges independently assigned each completed 
form to one of two categories, “best” or “not- 
best” decisions. A “best” decision was defined 
as one that, if followed (i.e., if the toy for 
which specific parts were ordered was com- 
pleted and sold), would result in the maximum 
profit to be realized at a particular time from 
the money enclosed with the form; and all 
other forms were placed in the “not-best” 
class. The judges agreed in their categorization 
of all 131 order forms. A test of the independence 
of the experimental conditions and 
the quality of group decisions yields a chi- 
square value of 6.55 (P less than .01) indicating 
that the two variables are significantly related. 
Departures of the observed cell values from 
expectancy are in the direction of fewer “not- 
best” decisions reached under Condition R 
and more decisions of this type reached under 
A. The A groups submitted more than twice 
as many orders as the R groups, or 88 as 
compared with 43 (a deviation from chance 
significant at the .01 level). But the latter 
groups turned in a disproportionately small 
number of orders representing “not-best” 
decisions. Sheer volume of orders is not a valid 
measure of productivity on this task, since 
higher activity level is not necessarily ac- 
companied by significantly greater profit. 
Our impression is that under Condition R 
individuals were more self-critical and cautious 
about making suggestions than under Condi- 
tion A, where they felt free to express any and 
all proposals that occurred to them. 
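
The comparison of order volume (88 as against 43) can be checked with a one-degree-of-freedom chi-square test against an equal split. This is a sketch under the assumption that "deviation from chance" refers to such a test; the text does not name the procedure used. The cell counts for the "best" and "not-best" classification were not reported, so that second test is not reproduced here. Names are hypothetical.

```python
def chi_square_equal_split(observed):
    """Chi-square goodness-of-fit statistic against equal expected frequencies."""
    total = sum(observed)
    expected = total / len(observed)
    return sum((count - expected) ** 2 / expected for count in observed)


# Orders submitted under Condition A and Condition R, as reported in the text.
orders = [88, 43]
chi_square = chi_square_equal_split(orders)

# Tabled critical value of chi-square for 1 df at the .01 level.
CRITICAL_01_DF1 = 6.635
exceeds_01_level = chi_square > CRITICAL_01_DF1
```

With an expected frequency of 65.5 orders per condition, the statistic is about 15.46, well beyond the .01 critical value.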

When separate t tests were made of the 
mean differences in their stanine scores on 
Hedonic Tone and Viscidity, the means of 
the 24 Ss run under A were significantly (P 
less than .01) higher than those of the 24 Ss run 
under R. The correlations between profits and 
stanine scores (averaged to obtain a group 
measure for each of the 24 groups) were 
significant at the .05 level and positive for 
both dimensions: Hedonic Tone, .48, and 
Viscidity, .40. Without regard to experimental 
condition there was a hardly surprising tend- 
ency for making more profit to be accom- 
panied by a general feeling of pleasantness 
and by less dissension. 


DISCUSSION 


To the extent that the operations performed 
did permit a test of the stated hypothesis, it 
must be rejected. When selected individuals 
were placed in a situation in which their 
attempts to lead resulted in personal rejection, 
those with a relatively high inferred need for 
achievement and a low need for affiliation 
were not able to lead more than those with 
relatively high affiliation and low achievement 
needs. We began with the question of who will 
be able to try to lead, if his attempts result 
in his own personal rejection. So far our best 
reply is, “Nobody,” or “Well, hardly anybody,” 
at least in the case of the “temporary, 
ad hoc group” (Strodtbeck, 20). 








But despite the reported checks upon the 
experimental procedures, there still are grounds 
for doubt as to whether the experiment did 
permit an adequate test of the theoretical 
prediction. There are several alternative but 
not mutually exclusive possibilities: (a) the 
need measures were inadequate; (b) the two 
needs were not sufficiently differentiated in 
the samples employed; (c) an underlying 
characteristic of “other-directedness” (Ries- 
man, 16) is so typical of the contemporary 
college population that otherwise accurately 
classified “high need achievers” are unable to 
withstand strong personal rebuff by their 
peers; (d) the required experimental control 
was not maintained so that there was—in the 
view of the Ss—equivalent opportunity for the 
satisfaction of n Ach under both experimental 
conditions (cf. Asch, 1); and (or) (e) the 
observed reactions to such conditions would be 
different, if the groups involved continued to 
operate over a longer period of time. 

On the positive side, the experiment is a 
clear-cut demonstration of the importance of 
situational factors as predictors of attempts to 
lead. The results are unambiguous in support- 
ing the inference that, even after strenuous 
efforts have been made to select for study two 
classes of individuals whose behavior under 
“normal” circumstances indicates that they 
represent opposite “personality types,” those 
persons are more alike than different, when 
exposed to strong and consistent personal 
rejection or acceptance. Such extreme condi- 
tions produce wide variability in the frequency 
of leadership attempts, and the effect of these 
treatments cuts across and obliterates indi- 
vidual differences in the measured personal 
characteristics. This outcome is consistent 
with the current emphasis upon situational 
variables as determinants of leadership be- 
havior. 

Our collateral findings pertaining to group 
productivity and morale call to mind Shaw’s 
recent study (17) of group performance within 
various communication nets under appointed 
“authoritarian” and “non-authoritarian” 
leaders. He compares his results with those of 
Lewin, Lippitt, and White’s (11, 12) earlier 
experiment where groups worked under trained 
“autocratic,” “democratic,” and “laissez faire” 
leaders. In both these cases, group climate was 
varied through the manipulation of the be- 
havior of appointed leaders, whereas what 
“sound like” similar atmospheres were induced 
in the present experiment by different, but 
complementary procedures. Any generaliza- 
tions about the effects of experimentally 
created climates upon group productivity 
must be qualified to the extent that, in 
contrast to what happens in natural group situations, 
the same conditions do not hold for all 
parties to the interaction. We have not been 
able, however, to resist the temptation to com- 
pare our own results with those of the other 
two studies on the provisional assumption that 
in a phenomenal sense the situations were com- 
parable. If we take this inferential leap, some 
interesting parallels and contradictions appear. 

These comparisons may be summarized in 
tabular form: 


Authors            Experimental           Quantitative           Qualitative      Morale
                   Conditions             performance            performance

Lewin, Lippitt,    (a) Autocratic vs.     Higher under (a)       Better under     Higher under
& White            (b) democratic         than (b)               (b) than (a)     (b) than (a)
                   leaders

Shaw               (a) Authoritarian vs.  Higher under (a)       Better under     Higher under
                   (b) nonauthoritarian   than (b)               (a) than (b)     (b) than (a)
                   leaders

Pepinsky,          (a) Rejection vs.      No signif. difference  Better under     Higher under
Hemphill,          (b) acceptance of      between (a) and (b)    (a) than (b)     (b) than (a)
& Shevitz          potential leaders

When we take into account that the tasks 
assigned to the groups in the three studies 
were objectively quite different, that in 
Shaw’s experiment the Ss were limited to 
written communication, and that in the 
Lewin, Lippitt, and White study the Ss were 
children working under adult leaders, it seems 
remarkable that there is as much apparent 
agreement as there is among the authors’ 
conclusions. In all three experiments the 
morale of the Ss was higher when they were in 
situations characterized by “warmth and 
permissiveness” as opposed to those char- 
acterized by “coldness and criticism.” In the 
two earlier studies, however, productivity as 
measured by quantitative performance criteria 
was relatively high under autocratic or 
authoritarian leaders. In our study this kind 
of comparison shows no significant difference 
between conditions, but the absolute trend 
was in an apparently opposite direction. It 
may be legitimate to regard this result as a 
function of the manipulation of follower 
rather than leader behavior, but it is just as 
plausible to suspect that it is an effect of 


uncontrolled variation in the stooges’ be- 
havior. In two out of three studies a cold 
climate also seemed to produce qualitatively 
better performance. Lewin, Lippitt, and White 
say otherwise, but their statement seems to be 
based on informal judgment only. 

In general, the results of these laboratory 
experiments are congruent with the repeated 
findings of studies made in field situations 
(Brayfield & Crockett, 4) that “morale” and 
group “productivity” are seldom significantly 
and positively related, regardless of wide dif- 
ferences in the definition and measurement 
of these variables. These empirical data are, 
of course, offensive to the democratic Zeitgeist. 
But at least in the case of the experiment we 
have just reported, we can say that, although 
groups more frequently made “poor” decisions 
under acceptance conditions than under 
rejection conditions, they compensated for 
this by their greater activity so that in the 
long run they did at least as well in terms of 
money earned—and were happier besides! 


SUMMARY 


This report has described an experiment 
that was designed to test the hypothesis that 
under conditions such that to attempt to lead 
is to invite rejection, group members char- 
acterized by a relatively high need for achieve- 
ment and a relatively low need for affiliation 
would make more frequent attempts to lead 
than would other members having relatively 
high affiliation and low achievement needs. The 
experimental design provided for a test of this 
hypothesis under controlled laboratory condi- 
tions and involved a total of 24 four-man 
groups. 

The principal experimental findings are: 

1. Under Condition R (rejection) individuals 
with high need Achievement did not attempt 
to lead with greater frequency than individuals 
with high need Affiliation. Therefore, the 
central hypothesis is not supported by the 
data. 

2. Differences in attempts to lead are 
clearly attributable to differences in the 
experimental conditions to which the groups 
were subjected, rather than to the major needs 
specified. There were significantly more at- 
tempts to lead under Condition A (acceptance) 
than under Condition R, regardless of indi- 
vidual differences in inferred needs. 

3. A collateral analysis of group produc- 
tivity, as measured by money earned, shows 










that there was not a significant difference in 
net profits under the two conditions, but that 
fewer poor group decisions were made under R. 
Under Condition A, however, there was higher 
morale, and the groups compensated for their 
less consistently good decisions by maintaining 
a relatively high activity level so that, in the 
long run, they made as much money as the 
less sanguine rejection groups. 

The results suggest that if we are to under- 
stand better the development of characteristic 
group atmospheres and the effects of such 
conditions upon behavior in natural groups, we 
do well to take into account the treatment of 
the leader by others, as well as his treatment 
of others. 


THE EFFECTS OF ANXIETY LEVEL AND PSYCHOLOGICAL STRESS 
ON VERBAL LEARNING 


JANET A. TAYLOR 
Northwestern University 


USING the Manifest Anxiety Scale 
(MAS) as a selective device, a num- 
ber of studies have been conducted 
by Spence, Taylor, and their associates in an 
attempt to test hypotheses derived from 
Hullian theory concerning the relationship 
between the performance of human Ss in cer- 
tain learning situations and level of total effec- 
tive drive. Use of the MAS in this connection 
has rested on the assumption that the scores are 
related to level of emotionality and drive (D). 
Considerable experimental support has been 
found for these hypotheses (hereafter referred 
to, for convenience, as drive theory), thus con- 
firming both the notion that MAS scores reflect 
level of D and the postulated interactions be- 
tween drive level and learning task variables 
(10). 

The major purpose of these studies has been, 
as indicated above, to investigate hypotheses 
concerning the role of D, the MAS having been 
developed and used merely as a convenient 
instrument for selecting Ss differing in drive 
level. It is also possible, as many have pre- 
ferred to do, to view the results of the studies 
comparing the performance of extreme scorers 
on the MAS as throwing some light on the 
personality pattern of manifest anxiety itself. 
There are undoubtedly many characteristics 
other than drive level on which such high anx- 
iety (HA) and low anxiety (LA) Ss differ; 
identification and incorporation of such char- 
acteristics into drive theory should broaden 
the usefulness of the latter and provide further 
understanding of anxiety as a personality 
variable. 

One such characteristic, mentioned by sev- 
eral writers (1, 4, 6, 10) as possibly differen- 
tiating between the two extreme anxiety 
groups, concerns their reactions to psycho- 
logical stress (e.g., reports of inadequate per- 
formance in an experimental task). The pres- 
ent writer, for example, has suggested that 
under conditions of psychological stress, in- 
ternal responses of the type that Child (1) has 
referred to as task-irrelevant, are more easily 
or more intensely aroused in HA Ss than in 
LA. In experimental tasks in which such 
extratask responses interfere with efficient per- 
formance (e.g., verbal learning), HA Ss under 
stress conditions would therefore be expected 
to be inferior to neutral control Ss, whereas the 
performance of LA Ss should be affected to a 
lesser degree (10). 

These hypotheses were suggested by the 
results of studies by Gordon and Berlyne (3) 
and by Lucas (4). In both of these investiga- 
tions, HA Ss told that their performance on a 
verbal learning task was inadequate were in- 
ferior in performance on a subsequent task to 
HA groups run under neutral conditions. The 
LA groups, in contrast, showed no decrement 
in performance under stress, when compared 
to their control groups. While the results of 
these studies seem to confirm the notion that 
HA Ss are more liable to make interfering 
extratask responses under conditions of psy- 
chological stress, they could also be attributed 
simply to a greater emotional reaction to the 
stress instructions and hence a greater increase 
in drive level on the part of the HA groups. 
That is, the empirical predictions generated 
from drive theory state that increases in drive 
level facilitate performance in relatively 
simple tasks in which a single response ten- 
dency is evoked (e.g., classical conditioning); 
in more complex situations in which compet- 
ing intratask responses are evoked and the 
correct response tendency is relatively weak, 
high drive (anxiety) Ss tend to lose their superiority 
and, as the incorrect responses become more numerous 
and/or more dominant, to 
become inferior. Both the Gordon and Berlyne 
and the Lucas studies employed learning tasks 
of the competitional type. Thus it could be ar- 
gued that the HA Ss reacted with greater emo- 
tionality than LA groups to the stress condi- 
tions and the resultant increase in drive was 
responsible for their performance decrement. 

The present study was designed to provide 
an experimental arrangement in which the 
effects of increasing drive levels would be 
expected to result in differences between HA 
and LA Ss in the opposite direction to those 
expected if extratask, interfering responses 
were aroused by the stress condition. Specifically, 
verbal learning tasks were designed in 
such a manner as to minimize competing intra- 
task responses (11). Hence, drive theory would 
predict that under neutral conditions, a HA 
(high drive) group would perform at a higher 
level than LA. If the introduction of stress 
results simply in an increase in drive level, and, 
further, HA Ss are more reactive to such 
stress, these Ss should increase their margin 
of superiority over the LA (when compared to 
neutral groups). If, on the other hand, the 
major effect of stress is to arouse competing 
extratask responses, the HA group should no 
longer exhibit a performance superior to the 
LA and may even be inferior to them. 


METHOD 


Subjects. The high anxiety (HA) group consisted of 
40 Ss who scored 23 or above on the MAS and the low 
anxiety (LA) group of 40 Ss scoring 9 or below. Each 
group was further subdivided into a stress and a neutral 
group of 20 Ss each, individuals being alternately 
assigned to the subgroups as they appeared to be 
tested. All Ss were drawn from introductory psychology 
classes to whom the MAS had been given and were 
naive both with respect to previous experience in verbal 
learning studies and the purpose of the experiment. 
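
The selection and assignment rule described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the procedure as actually run: the function names and the sample scores are hypothetical, and the text does not say which condition received the first arrival, so the sketch arbitrarily starts each anxiety group with the stress condition. Only the cutoffs (23 or above, 9 or below) and the alternation come from the text.

```python
from typing import Optional

HA_CUTOFF = 23  # MAS score at or above which an S counts as high anxiety
LA_CUTOFF = 9   # MAS score at or below which an S counts as low anxiety


def anxiety_group(mas_score: int) -> Optional[str]:
    """Return 'HA', 'LA', or None for scores in the unused middle range."""
    if mas_score >= HA_CUTOFF:
        return "HA"
    if mas_score <= LA_CUTOFF:
        return "LA"
    return None


def assign_alternately(mas_scores):
    """Assign qualifying Ss to stress/neutral alternately, in order of arrival.

    Assumption: the first S of each anxiety group goes to the stress
    subgroup; the source does not state which condition came first.
    """
    counts = {"HA": 0, "LA": 0}
    assignments = []
    for score in mas_scores:
        group = anxiety_group(score)
        if group is None:
            continue  # middle scorers were not tested
        condition = "stress" if counts[group] % 2 == 0 else "neutral"
        counts[group] += 1
        assignments.append((score, group, condition))
    return assignments
```

Alternation within each anxiety group guarantees equal subgroup sizes (20 and 20 in the study) without requiring the full roster in advance.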

Materials. A practice list and two equivalent experi- 
mental lists were employed, each consisting of eight 
pairs of nonsense syllables. The practice list contained 
syllables of 47-53 per cent association value (2) and 
the experimental lists syllables of 0-20 per cent asso- 
ciation value. Both interlist and intralist similarity 
were at a minimum. The specific lists employed were 
ones that had been used in a previous study (11) in 
which it had been predicted and found that HA Ss 
perform at a level superior to LA Ss. 

The lists were presented by means of a Hull-type 
memory drum at a 2-2 sec. rate with 4 sec. between 
trials. The syllables were typed on an endless white 
tape, three orders being prepared for each list to 
minimize serial learning. 

[TABLE 1. Mean Number of Total Correct Responses of the HA and LA Groups on the First ... (caption truncated; table values not recoverable from the source)] 

Procedure. Following preliminary instructions, which 
included the statement that performance on learning 
tasks is related to intelligence, each S was given the 
practice list for 10 trials and then each of the two ex- 
perimental lists for 15 trials. Order of presentation of 
the two experimental lists was counterbalanced within 
each of the four subgroups. For Ss in the neutral, 
nonstress subgroups, the instructions given before each 
of the experimental lists indicated merely that S was 
to learn another list of the same type. Before receiving 
the second experimental list, however, Ss assigned to 
the stress condition were told that their performance 
on the previous lists had been considerably poorer than 
that of most Ss, asked whether they had been trying hard or 
felt well, and told that it was hoped they would be able 
to improve their record. At the end of the experiment, 
all stress Ss were told of the deception, assured that 
their performance had been adequate, and cautioned 
not to discuss the nature of the experiment with others. 


RESULTS 


On the first experimental list, given to all Ss 
under neutral conditions, the HA group ex- 
hibited the superiority in performance pre- 
dicted by drive theory, as may be seen in Table 
1. The total number of correct responses given 
by each S was analyzed statistically by a 2 x 2 
analysis of variance summarized in Table 2, 
which permitted a test of the equivalence of 
the subgroups as well as of the effects of anx- 
iety level. The F for the HA-LA comparison 
was significant (p < .05), as anticipated. The 
terms for conditions and for the interaction 
between groups and conditions were not sig- 
nificant, thus suggesting that within each 
anxiety group, the subgroups were adequately 
matched. 
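
As a check on the List 1 analysis, the F ratios can be recomputed from the sums of squares and degrees of freedom printed in Table 2. The sketch below is an illustrative recomputation, not part of the original analysis; the variable and function names are hypothetical, and the critical value of 3.98 is the one quoted in the table footnote for 1 and 70 df.

```python
def f_ratio(ss_effect, df_effect, ss_within, df_within):
    """F = (SS_effect / df_effect) / (SS_within / df_within)."""
    return (ss_effect / df_effect) / (ss_within / df_within)


# List 1 entries as printed in Table 2 (each effect has 1 df).
LIST_1_EFFECTS = {"Groups": 1036.45, "Conditions": 288.45, "G x C": 0.70}
LIST_1_WITHIN_SS, LIST_1_WITHIN_DF = 19187.90, 76

# Critical value quoted in the table footnote (1 and 70 df, .05 level).
F_CRIT_05 = 3.98

list_1_f = {
    source: f_ratio(ss, 1, LIST_1_WITHIN_SS, LIST_1_WITHIN_DF)
    for source, ss in LIST_1_EFFECTS.items()
}
significant_05 = [s for s, f in list_1_f.items() if f >= F_CRIT_05]

for source, f in list_1_f.items():
    print(f"{source:<11} F = {f:.2f}")
```

Only the Groups (HA vs. LA) term exceeds the quoted critical value, which agrees with the statement that the anxiety comparison was significant while the conditions and interaction terms were not.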

TABLE 2 
ANALYSES OF VARIANCE OF NUMBER OF TOTAL CORRECT RESPONSES ON THE TWO EXPERIMENTAL LISTS 

                        List 1                              List 2 
Source        SS         df   MS        F*       SS         df   MS        F* 
Groups        1036.45    1    1036.45   4.11     1224.49    1    1224.49   3.88 
Conditions    288.45     1    288.45    1.14     3137.51    1    3137.51   9.95 
G x C         .70        1    .70       --       70.44      1    70.44     -- 
Within        19187.90   76   252.47             23973.05   76   315.44 

* For 1 and 70 df, F = 3.98 at .05 level and 7.01 at .01 level. 

TABLE 3 
MEAN DIFFERENCE BETWEEN THE TWO EXPERIMENTAL LISTS (LIST 2 - LIST 1) FOR THE FOUR SUBGROUPS 

              HA                    LA 
        Stress    Neutral     Stress    Neutral 
M       -4.10     +3.00       -5.10     +5.25 
SD      16.93     9.66        14.87     12.36 

Perhaps the clearest demonstration of the 
effects of the stress condition on second list 
performance is afforded by obtaining a difference 
score for each S (total number correct 
responses on List 2 minus total number on List 
1). The mean difference scores for each of the 
four subgroups are reported in Table 3. While 
both of the subgroups tested under neutral 
conditions improved in performance on the 
second list (presumably because of practice 
effects), the two stress subgroups were poorer. 
An analysis of variance of the second list 
data, shown in Table 2, indicated that the F 
for conditions was significant at the .01 level 
while the F for the anxiety groups was signifi- 
cant between the .05 and .10 levels. The F for 
the interaction between groups and conditions, 
however, was less than 1.00. Thus, although 
the stress condition had the effect of interfer- 
ing with performance, the HA Ss did not show 
any greater responsiveness to it and within 
each experimental condition, maintained their 
previous margin of superiority over the LA. 
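
The absence of a differential reaction to stress can be illustrated descriptively from the Table 3 means. The sketch below assumes the table is read as HA stress -4.10, HA neutral +3.00, LA stress -5.10, LA neutral +5.25; the names are hypothetical, and the contrast it computes is only a description of the means, not the significance test reported in Table 2.

```python
# Mean difference scores (List 2 minus List 1) as read from Table 3.
MEAN_DIFF = {
    ("HA", "stress"): -4.10,
    ("HA", "neutral"): 3.00,
    ("LA", "stress"): -5.10,
    ("LA", "neutral"): 5.25,
}


def stress_effect(group):
    """Stress minus neutral mean difference score for one anxiety group."""
    return MEAN_DIFF[(group, "stress")] - MEAN_DIFF[(group, "neutral")]


def interaction_contrast():
    """HA stress effect minus LA stress effect.

    A negative value would mean the HA Ss lost more under stress than
    the LA Ss; a value near zero or positive means they did not.
    """
    return stress_effect("HA") - stress_effect("LA")


print(f"HA stress effect: {stress_effect('HA'):+.2f}")
print(f"LA stress effect: {stress_effect('LA'):+.2f}")
print(f"Interaction contrast: {interaction_contrast():+.2f}")
```

On this reading both groups lose ground under stress, and the HA decrement is, if anything, slightly smaller than the LA decrement, which is in line with the nonsignificant interaction term.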


DISCUSSION 


To summarize the results of the present 
study, HA Ss under the neutral, nonstress 
conditions performed at a superior level to the 
LA group, as was predicted by drive theory. 
The subgroups told that their performance had 
been inadequate showed a significant decre- 
ment in subsequent performance when com- 
pared to their neutral controls. If it is assumed 
that the superior performance of the HA Ss 
tested under neutral conditions was due to 
their higher drive level, the poorer perform- 
ance of the stress groups can be taken as indi- 
cating that the major effect of the stress 
instructions was to arouse extratask irrelevant 
responses which interfered with efficient per- 
formance. No evidence was found, however, to 
support the contention that HA Ss are more 
prone to exhibit these extratask responses, i.e., 
there was no interaction between anxiety level 
and the stress-neutral conditions. Superficially, 
these results appear to contradict the previous 
studies of Gordon and Berlyne (3) and of 
Lucas (4), both of which found a greater in- 
feriority of HA groups under stress conditions 
than under neutral conditions. That is, whether 
one wished to interpret their results as due to 
the greater emotional responsiveness (and 
hence proportionately higher drive level) of 
HA Ss to the stress or to a greater tendency to 
make irrelevant responses under stress, some 
type of differential reaction to stress by the 
HA and LA groups might have been antici- 
pated in the present study. One possible recon- 
ciliation is to assume that in HA Ss both drive 
level and extratask responses increase pro- 
portionately more in stress situations than in 
LA; in the present situation (employing 
materials with low intratask competition) 
these two factors affect performance in 
opposite directions, increased drive tending 
to facilitate and irrelevant responses to inter- 
fere. Thus the two may have cancelled each 
other. In the previous studies, on the other 
hand, the learning tasks were such that both 
factors could be assumed to contribute to 
performance decrement, thus leading to the 
greater apparent susceptibility of the HA 
groups to stress. 

The results of the present study also have 
bearing on several alternative hypotheses that 
have been offered in place of drive theory to 
explain differences in performance that have 
been found between HA and LA groups. The 
hypothesis that HA Ss react to stress situations 
with irrelevant responses to a greater degree 
than LA was originally proposed by Child 
(1). Child, however, applied this notion to all 
of the studies carried out to evaluate drive 
theory, including those in which no attempt 
was made to introduce psychological stress. 
In common with proponents of drive theory, 
Child also assumed that HA Ss react more 
emotionally and hence have a higher drive 
level in experimental situations than LA. 
He stated further that for the HA individual, 
this emotionality always serves as a cue for 
the evocation of strong irrelevant extratask 
responses, responses which are weak or rela- 
tively lacking in LA Ss. In simple types of 
situations (e.g., classical conditioning) in 
which performance of the correct response is 
relatively invulnerable to the disruptive effects 
of irrelevant responses, HA Ss should be supe- 
rior due to their higher drive level. The per- 
formance of relatively complex tasks (e.g., 
verbal learning), however, can be disrupted 
by extratask responses. The extratask responses 
of HA Ss interfere with performance 
more than their higher drive level facilitates it, 
thus resulting in the performance of HA Ss 
being inferior to LA groups in these situations, 
even under so-called “neutral” conditions. 
The addition of stress instructions serves to 
increase the drive level and, hence, the number 
and/or intensity of irrelevant responses in 
HA Ss, and therefore magnifies their inferiority. 

The data available at the time Child offered 
his hypotheses concerning the correlation 
between drive level and extratask responses 
in HA Ss seemed to support his contentions 
about as well as they did drive theory, which 
emphasized the importance of intratask 
response competition in determining the rela- 
tive performance of HA and LA groups. 
However, several later investigations of verbal 
learning (5, 8, 9, 11) have demonstrated the 
superiority of HA Ss in tasks with minimal 
intralist interference. As the writer has pre- 
viously pointed out (10), these results, while 
predicted by drive theory, seem to weaken 
the plausibility of Child’s rival hypothesis 
since it would appear to be difficult to maintain 
that verbal learning tasks of low intralist com- 
petitiveness are not susceptible to extratask 
interference. The design of the present study, 
however, appears to provide a direct test of 
Child’s hypotheses. That is, a superior per- 
formance of HA Ss, as compared to LA, would 
indicate, according to Child’s notions, that the 
task employed must have been of the type in 
which performance is relatively unaffected by 
the presence of irrelevant responses. If stress 
were introduced into a situation employing 
such a task, the drive level of HA Ss (and 
perhaps of the LA as well although to a lesser 
degree) and hence the magnitude of irrelevant 
responses would be raised. But, since the latter 
cannot interfere with performance on this type 
of task, the HA group, with their increased 
drive, should show even greater superiority of 
performance when compared to LA Ss than 
when operating under neutral conditions. 
Under the neutral conditions of the present 
study, HA Ss were superior to the LA, in- 
dicating that the lists, according to Child, 
were relatively invulnerable to extratask 
responses. For the stress condition, Child 
would therefore predict an increased margin of 
superiority of the HA group over the LA, as 
well as the superiority of the HA stress Ss over 
the HA neutral group. The actual results, 
of course, showed the opposite: a decrease in 
performance in both stress groups and to the 
same extent in the LA and HA groups. Thus 
Child’s hypotheses concerning the crucial role 
of irrelevant response tendencies were not up- 
held. 

In a recent study, Saltz and Hoehn (6) also 
proposed an alternative to drive theory based 
in part on differential reactions of HA and LA 
groups to stress. They suggest that previously 
reported differences between HA and LA 
groups (under neutral conditions) are not 
related to the interaction between drive level 
and the number and relative strengths of 
competing intratask responses as drive theory 
has postulated but rather to the “difficulty” 
of the tasks. As task difficulty (which, they 
state, had previously been confounded with 
task complexity) increases, the anxiety level of 
HA Ss rises proportionately more than that of 
LA Ss, and this anxiety, in some manner which 
they did not specify, leads to response decre- 
ment. 

These speculations, insofar as they are 
susceptible to empirical evaluation, do not 
appear to be supported by available evidence. 
The lack of differential reaction of the anxiety 
groups to stress in the present study casts 
doubt, of course, on the special role of this 
variable. With respect to the effect of difficulty, 
their statements imply that in “easy” tasks, 
HA Ss should be equal in performance to LA 
and become increasingly inferior as tasks be- 
come difficult. A number of investigations 
(10) involving both nonverbal and verbal 
tasks (including those of the present study) 
have, however, shown HA groups to be superior 
in performance. Further, examination of learning 
studies does not show, insofar as it is 
possible to make judgments of relative diffi- 
culty, any consistent relationship between the 
“ease” or “difficulty” of a task and the direc- 
tion of performance difference between HA 
and LA groups. For example, the experi- 
mental lists of the present study consisted of 8 
(noncompetitional) pairs of nonsense syllables 
of 0-20 per cent association value in contrast 
to 15 (noncompetitional) pairs of adjectives 
employed in an investigation by Spence, 
Farber, and McFann (8). In spite of its greater 
length, the adjective list was considerably 
“easier” than the first experimental list of the 
present study (e.g., at trial 10 for the total 
groups, a mean of 3.3 correct responses out of 
8 syllables vs. a mean of approximately 14.3 
out of 15 adjectives). Although there were 
marked differences in difficulty, HA Ss were 
superior on both lists. Spence, Farber, and 
McFann also reported the results of a second 
adjective list, involving both competitional 
and noncompetitional pairs. This list, too, was 
“easier” than the syllable list of the present 
study, yet HA Ss were superior to the LA on 
the noncompetitional pairs and inferior on the 
competitional pairs. 

In their investigation, Saltz and Hoehn 
manipulated, variously, the association value 
and formal intralist similarity of serial lists of 
nonsense syllables in an attempt to obtain lists 
of varying difficulty level and degree of 
intralist competition. Their results, they 
stated, were contradictory to drive theory 
and supportive of their difficulty notions. A 
number of features of the study, however, make 
their findings difficult to interpret. The types 
of lists employed and their rules of construction 
make predictions in terms of drive theory 
troublesome if not impossible. As Spence (7) 
has pointed out, in order to make predictions 
about any learning situation from drive theory 
it is also necessary to have a specific theory 
about the variables involved in the learning 
task. Neither Spence nor the writer has pre- 
sented a theory of serial learning, let alone any 
hypotheses as to the role of association value of 
nonsense syllables, nor did Saltz and Hoehn. 

Quite independent of the adequacy of the 
lists for testing drive theory, a closer examina- 
tion of their results does not supply the sup- 
port for the difficulty hypothesis that the 
writers assert. Saltz and Hoehn employed three 
serial lists of equal length, specifying that the 
relative difficulty of each be defined in terms 
of the performance (number of trials to 
criterion) of the LA Ss. Two of the lists (of 
competing 90 per cent association value mate- 
rial and of noncompeting 13 per cent association 
material) were chosen to be equal in difficulty 
(i.e., no significant difference in the per- 
formance of LA groups). On both of these lists, 
there was no significant difference in the performance 
of the HA and LA groups.¹ On the 
third list consisting of noncompetitional 0% 
association value material the HA were 
significantly inferior to the LA, the approxi- 
mate difference in mean number of trials to 
criterion being a striking 13 trials. This in- 
feriority constituted the writers’ evidence for 
support of their difficulty hypothesis. Their 
conclusion rests on the stated assumption, that 
was not tested, that the zero per cent list was 
more difficult than the 13 per cent list (on 
which LA and HA did not differ). When the 
performance on the zero per cent and 13 per 
cent lists are compared for the LA groups, 
whose performance, it will be recalled, had been 
designated as defining the relative difficulty of 
the tasks, practically identical mean trials to 
criterion can be noted. The general literature 
on the effects of zero vs. 13 per cent association 
value does not make this equivalence sur- 
prising but it does fail to confirm the critical 
assumption that their zero list was more 
difficult. Thus, we are faced with what rea- 
sonably might be called three lists of equal 
difficulty, the HA Ss being equal in per- 
formance to the LA on two of them and 
inferior by a disconcertingly large margin on 
the third. This inconsistency might be attrib- 
uted to the writers’ choice of military person- 
nel (airmen) as Ss. Correlations between 
measures of intellectual ability and MAS 
scores have often been found with such intel- 
lectually heterogeneous groups. While in part 
such correlations might be attributed to the 
disruptive effects of high anxiety (drive?) 
level on the performance of complex tasks, 
there is good evidence (12) to suggest that 
genuine intellectual differences are represented 
among high and low scoring groups drawn from 
such intellectually heterogeneous populations 
due to the greater ability and/or willingness of 
the more intelligent to recognize the purpose of 
the scale and distort their answers in a 
“healthy” direction. Thus, their series of 
groups may have been contaminated in 
unknown degrees by shifting intelligence levels, 
“false negatives” in the low scoring groups, 
etc. In short, previous investigations do not 
support the difficulty hypothesis, while the 
Saltz and Hoehn study itself seems inadequate 
to provide any meaningful test. 

¹ Saltz and Hoehn made no prediction about whether 
or not any differences should be expected between HA 
and LA Ss (although it might be pointed out that in 
terms of the type of materials selected and the mean 
number of trials to criterion taken by the groups, the 
lists were, in usual terms of reference, on the difficult 
side and some inferiority of the HA Ss might have 
been expected). The only “confirmation” of the diffi- 
culty hypothesis in this comparison lay in the rather 
negative finding that with two lists of equal difficulty, 
the lack of significance found between the HA and LA 
groups in one list was also found in the other. 


SUMMARY 


On the assumption that scores on the MAS 
reflect drive level, a number of studies have 
been conducted with high and low anxiety 
groups to test hypotheses concerning the inter- 
action between drive level and certain learning 
task variables in determining performance. 
The present study attempted to investigate 
the suggestion that another characteristic 
on which high and low anxiety Ss differ is their 
susceptibility to irrelevant, extratask responses 
under conditions of psychological stress. It 
was predicted that under neutral conditions 
high anxiety (high drive) Ss would exhibit a 
performance superior to that of low anxiety 
(low drive) Ss on a paired-associate learning 
task with minimal intratask interference but 
that under conditions of psychological stress 
(report of inadequate prior performance) 
high anxiety Ss, due to the greater arousal of 
interfering extratask responses, would no 
longer exhibit the superiority found under 
neutral conditions. Results indicated that while 
the high anxiety Ss under neutral instructions 
were significantly superior to the low anxious, 
as predicted, and the Ss operating under stress 
were inferior to their neutral controls, the 
predicted interaction between anxiety level 
and stress was not found. The implications of 
these results for several rival hypotheses of 
the drive interpretation of anxiety were 
discussed. 


PATTERNS OF MANAGERIAL TRAITS AND GROUP EFFECTIVENESS 


EDWIN E. GHISELLI AND THOMAS M. LODAHL 
University of California 


IN some group task situations the final 
product is merely the sum of the efforts 
of the individual members. In others, the 
productive end requires an interaction among 
the members and therefore their efforts must 
be coordinated. Particularly with the second 
type of situation, a substantial portion of the 
group’s attention must be devoted to planning 
and the integration and direction of the activi- 
ties of the individual members. Planning, 
integration, and direction may be termed man- 
agerial functions inasmuch as they pertain to 
the organization of the operations of the in- 
dividuals so that they are performed in a 
harmonious and effective manner with respect 
to achievement of the group goal. 

We are defining managerial functions, then, 
as those which are concerned with the organiza- 
tion, government, and control of the activities 
of the members of the group. These functions 
involve such activities as planning, policy 
making, execution, supervision, and decision 
making. Consequently, managerial traits would 
be those psychological characteristics of the 
individual that directly pertain to these func- 
tions. It can be argued that all characteristics 
of the individual are included within this 
definition of managerial traits. Thus, intelligence 
might be termed a managerial trait 
since it is related to effectiveness in planning 
and judgment. However, we do not here wish 
to make any rigorous classification of traits 
but merely to distinguish a class of them which 
in terms of the operations involved seem in one 
way or another to be intimately associated 
with managerial functions. The investigation 
of such traits would seem to be especially 
helpful in seeking to understand the contribu- 
tion of individuals to the group endeavor. 

There are two managerial traits which seem 
to us to be of particular significance in terms of 
the operation of the group as an organization. 
One trait is that of supervisory ability. It is 
apparent that there are a variety of ways in 
which supervision can be accomplished and 
that effectiveness of supervision is determined 
by a variety of situational factors such as the 
group’s culture, the physical facilities, and the 
group task goal. The effectiveness of a supervisor 
can be gauged in three ways. First, the 
product of the group he supervises can be com- 
pared with that achieved by the same or 
similar groups working with different super- 
visors. Secondly, it can be evaluated in terms 
of the perception of his effectiveness on the 
part of those he supervises, as by subordinates’ 
ratings of their superiors. Finally, a super- 
visor’s effectiveness can be gauged by how he is 
perceived by his superiors, as by means of 
superiors’ ratings of their subordinate supervisors. 
While all of these ways of evaluating 
the effectiveness of supervision provide pertinent 
information, the last would appear to 
have special significance with respect to or- 
ganizational problems since it is accomplished 
by the governing persons. It is with supervi- 
sory ability as evaluated in this way that we 
wish to concern ourselves in the present 
investigation. 

The second trait is concerned with what 
might be termed the decision-making ap- 
proach. At one extreme there are those in- 
dividuals who with confidence in their own 
individual talents and resourcefulness have the 
capacity to make decisions on their own. At 
the other extreme are those individuals who 
are careful planners and who feel the need to 
act with caution and to examine all the facts 
before arriving at a decision. We believe that 
this particular dimension of decision-making 
approach should be investigated in relation to 
group functioning since in industrial situations 
it differentiates between top management 
personnel, that is, those who give general 
direction to the organization, and middle 
management, those who carry out general 
policy and give it specific implementation (2). 

The purpose of the present investigation is 
to examine the relationship between the distri- 
bution of the traits of supervisory ability and 
approach to decision making, as here defined, 
in relation to the organizational characteristics 
and performance of small groups in a task 
situation requiring continuous interaction and 
coordination among the members. Specifically, 
with respect to both traits we wish to examine 
the importance of (a) the average amount of 
the trait as possessed by the members of a 
group, (b) the presence in the group of a person 
standing high in the trait, (c) the presence of a 
person who relative to the other members of 
the group is uncontested in the amount of the 
trait he possesses, and (d) the presence of such 
an uncontested person when the remainder of 
the group is homogeneous in the trait. 

TABLE 1 
RELIABILITY COEFFICIENTS OF AND INTERCORRELATIONS AMONG THE INDICES OF LEARNING 
[Indices: 1. Total Trips; 2. Total Wrecks; 3. Trips Last 4 Trials; 4. Wrecks Last 4 Trials; 5. Wrecks/Trips, Total; 6. Wrecks/Trips, Last 4 Trials; 7. Correlation Trips and Trials; 8. Rate of Learning, Trips. Coefficients not recoverable from the source.] 


METHOD 


After exploring several possibilities for a task sit- 
uation, it was decided that a small model railroad 
that required multiple controls to operate was most 
appropriate. The task was complex, involving both 
intellectual and motor components, and required close 
cooperation among all members of the group. There 
were two sets of identical control panels; half of each 
set contained controls for siding switches, and the 
other half, the power controls for the various sections 
of track. Up to four people could operate the railroad 
simultaneously. There were two trains on the track, 
and the task was to run both trains around the track 
in opposite directions as many times as possible in a 
three-minute trial. The group was given two points for 
each complete circuit of the track, with the proviso 
that both trains must make the same number of trips 
in any one trial. In order to minimize recklessness, the 
group was penalized five points for each wreck or 
derailment. There were 12 three-minute trials, with a 
one-minute rest period after each trial. The Ss were 
allowed to talk to each other at any time. 

The Ss were males, drawn from courses in psychol- 
ogy. Included in this analysis are data on 90 Ss, 10 
groups each of 2, 3, and 4 people. 


Measurement of Experimental Variables 


Performance in a learning task can be measured in a 
variety of ways. It is possible to take cognizance of 
correct responses and errors, total performance during 
the practice period and rate of improvement, final level 
of proficiency and consistency of performance. In the 
model railroad situation used in the present investiga- 
tion, eight different indices were obtained: number of 
total trips, total number of wrecks, number of trips 
during the last four trials (as a measure of final level of 
performance), number of wrecks during the last four 
trials, the ratio of wrecks to trips during the entire 
practice period (as a measure of quality of performance),
the ratio of wrecks to trips during the last four
trials, the coefficient of correlation between trips by 
trial and number of trial (as a measure of consistency 
of performance), and the slope of the learning curve of 
trips (as a measure of rate of improvement). The best 
fitting straight line was taken as an indication of slope. 
Wrecks declined only slightly during the practice 
period and hence slopes were not determined for them. 
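
The last two indices can be illustrated with a minimal sketch: a least-squares line fitted to trips per trial gives the slope, and the product-moment correlation between trips and trial number gives the consistency index. The function name `trend_indices` and the trip counts are hypothetical; the paper reports neither the raw data nor the computing procedure beyond "best fitting straight line."

```python
from math import sqrt


def trend_indices(trips_by_trial):
    """Return (slope, r) for trips regressed on trial number 1..n."""
    n = len(trips_by_trial)
    trials = range(1, n + 1)
    mean_x = sum(trials) / n
    mean_y = sum(trips_by_trial) / n
    # Sums of squares and cross-products around the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(trials, trips_by_trial))
    sxx = sum((x - mean_x) ** 2 for x in trials)
    syy = sum((y - mean_y) ** 2 for y in trips_by_trial)
    slope = sxy / sxx          # rate of improvement (trips gained per trial)
    r = sxy / sqrt(sxx * syy)  # consistency of improvement across trials
    return slope, r


# Invented trip counts for 12 three-minute trials (illustration only).
example_trips = [2, 3, 3, 4, 5, 5, 6, 6, 7, 8, 8, 9]
slope, r = trend_indices(example_trips)
```

A group whose trips rise steadily yields a positive slope and a correlation near 1; a group with erratic trial-to-trial output can have the same slope but a lower correlation.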

The eight measures of performance first varied with 
the size of the group. In general the larger the group the 
better the performance as measured by indices in- 
volving trips and the poorer as measured by indices 
involving wrecks. Furthermore, the distributions of
scores on all indices, particularly the last six, were
skewed. To adjust for differences in performance resulting
from group size and to correct for skewness,
the scores on each of the eight indices within each sized 
group were transmuted to normal standard scores. 
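
One common way to obtain normalized standard scores is a rank-based area transformation, sketched below. This is an assumption: the paper does not name the procedure it used. The scale of mean 50 and standard deviation 10 is likewise assumed, and the function name is hypothetical.

```python
from statistics import NormalDist


def normalized_standard_scores(raw, mean=50.0, sd=10.0):
    """Map raw scores to normal standard scores via their ranks.

    Tied raw scores share the average of the ranks they occupy.
    The mean/sd defaults are an assumed scale, not taken from the paper.
    """
    n = len(raw)
    order = sorted(range(n), key=lambda i: raw[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and raw[order[j + 1]] == raw[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # average 1-based rank for tied values
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    std_normal = NormalDist()
    # (rank - 0.5) / n is the midpoint percentile of each rank position.
    return [mean + sd * std_normal.inv_cdf((rk - 0.5) / n) for rk in ranks]
```

Following the text, such a transformation would be applied separately to the ten groups of each size, so that differences due to group size drop out.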

For the first six indices, reliability coefficients were 
determined from the coefficient of correlation between 
group normalized scores on odd and even trials ad- 
justed by the Spearman-Brown formula. These coeffi- 
cients are shown in the diagonals of Table 1. Two of 
these reliability coefficients are quite high, one reason- 
ably satisfactory, two borderline, and one a bit low. As 
a whole these various indices of group performance 
display moderately satisfactory precision in describing
the performance of the groups. 
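
The Spearman-Brown adjustment mentioned above estimates the reliability of the full set of trials from the correlation between two halves: r_full = 2r / (1 + r). A minimal sketch follows; the function name and the general `factor` parameter are illustrative, and the half-test correlation in the example is invented because the paper reports only the corrected coefficients.

```python
def spearman_brown(r_half, factor=2.0):
    """Step up a part-test correlation to the reliability of a test
    `factor` times as long (factor=2 for odd-even split halves)."""
    return factor * r_half / (1.0 + (factor - 1.0) * r_half)


# Invented odd-even correlation, for illustration only.
corrected = spearman_brown(0.5)
```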

In order to examine the dimensionality of these 
indices the intercorrelations among the eight scores 
earned by the groups were calculated. These coefficients 
are shown in Table 1. Following Tryon’s (3) method, 
the intercorrelations among the eight measures were 
subjected to a cluster analysis. The results indicated 
that there are two principal clusters as shown in Table 
2. One cluster includes all of the measures involving 
trips, and the other cluster the measures involving 
wrecks and quality of performance. It would seem 
reasonable to label the first cluster productivity and 
the second cluster recklessness, or, if the direction of 
the dimension is reversed, carefulness. 

Table 2 shows that the measures that define each of 
the clusters are substantially correlated with the do- 
main. If the four measures defining the first cluster are 
taken as a sample of the domain, the scores on the 
sample are estimated to correlate with domain scores 
.938. Taking the four measures defining the second 
cluster as a sample, scores on the sample are estimated 
to correlate with domain scores .937. Therefore, both 
samples are quite good indicators of their domains. The 

TABLE 2

COEFFICIENTS OF CORRELATION BETWEEN MEASURES AND CLUSTER DOMAINS

[Original columns: Cluster Domain I and Cluster Domain II. Only one entry per row is legible in the scan, and its column cannot be determined with certainty.]

Measures Defining Cluster I
  Total Trips                        (illegible)
  Trips Last 4 Trials                .929
  Correlation Trips and Trials       (illegible)
  Rate of Learning, Trips            (illegible)

Measures Defining Cluster II
  Total Wrecks                       .027
  Wrecks Last 4 Trials               -.009
  Wrecks/Trips, Total                -.547
  Wrecks/Trips, Last 4 Trials        -.514

coefficient of correlation between scores on the two 
domains, production and carefulness, is estimated to 
be .319. Thus, while scores on the two domains are 
not completely independent their relatedness is low. 
There appears to be a slight tendency for groups who 
are relatively more productive to be relatively more 
careful. 

A score for each group was determined for both 
clusters of performance variables by summing the four 
scores on the defining variables. All variables were
equally weighted except the correlation between trips and trials, which,
because of its substantially lower domain validity, was
weighted only half. In the following analysis these two
domain scores, productivity and carefulness, will be 
related to scores on the inventory used for measuring 
supervisory ability and approach to decision making. 
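
The weighting just described can be sketched for the productivity domain as follows. The dictionary keys, the function name, and the example scores are hypothetical; only the weights (unit weight for three measures, half weight for the correlation between trips and trials) come from the text.

```python
# Weights for the four measures defining the productivity cluster.
PRODUCTIVITY_WEIGHTS = {
    "total_trips": 1.0,
    "trips_last_4_trials": 1.0,
    "rate_of_learning_trips": 1.0,
    "correlation_trips_trials": 0.5,  # weighted only half, per the text
}


def domain_score(normalized_scores, weights):
    """Weighted sum of a group's normalized scores on the defining measures."""
    return sum(weights[name] * normalized_scores[name] for name in weights)


# Invented normalized scores for one group (illustration only).
example_group = {
    "total_trips": 60.0,
    "trips_last_4_trials": 55.0,
    "rate_of_learning_trips": 50.0,
    "correlation_trips_trials": 40.0,
}
productivity = domain_score(example_group, PRODUCTIVITY_WEIGHTS)
```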

From the observational data recorded by the experi- 
menter while the group was performing the task, it was 
possible to obtain two more scores for each group, the 
number of changes in organization and the number of 
changes in method. An organizational change was 
defined by (a) any trading of tasks from one member to
another, or (b) any change in the way in which the work
was divided. An example of the first type would be if 
Ss A and B were operating block controls and C 
operated switches, then C trades with B, so that now 
C operates blocks and B operates switches. An example 
of the second type would be if A and B are operating 
block controls and C operates switches, then B stops 
operating blocks and begins to direct the operations of 
A and C. 

There were two main ways to approach the assigned 
task of getting the trains around the track. One was to 
run both trains simultaneously, using the sidings to 
pass, and the other was to run only one train at a time 
either for one trip or several, stopping it at a siding, 
then running the other train. The first was termed the 
simultaneous solution, the second, the successive solu- 
tion. A change in methods was scored whenever a group 
changed from one method to the other. Split-half 
reliability coefficients were computed for these two 
types of changes, between odd and even trials. The 
corrected coefficient of reliability for organizational 
changes was .828, and for methods changes, .690. 


Measurement of Predictor Variables 


Ghiselli (1) has devised a Self-Description Inventory 
consisting of 64 personally descriptive adjectives pre- 
sented in forced choices. This inventory yields several 
scales, one of which is a scale of supervisory ability. 
This scale was validated by comparing the scores earned
by persons holding supervisory positions in lower and
middle management with ratings of their
effectiveness as supervisors made by their superiors.
The scale, then, would seem to measure the type of 
supervisory ability needed for the present investigation. 

Porter and Ghiselli (2) compared the responses to 
the items of the Self-Description Inventory made by 
persons in top management and in middle management. 
Examination of the adjectives checked by each group 
revealed that people holding top management positions 
tended to perceive themselves as being active and self- 
reliant, and willing to take action on the basis of confi- 
dence in themselves and their own abilities. In social 
relations they are equally confident and see themselves 
as straightforward and dignified. People holding middle 
management positions tend to see themselves as careful 
in planning and thoughtful in actions, seldom making 
rash decisions. They seem to wish to avoid giving the 
appearance of being controversial personalities and of 
exhibiting self-centered behavior. Taking the 21 items 
on the inventory that Porter and Ghiselli found to 
differentiate top from middle management personnel 
and give the foregoing self perceptions, a scale was 
formed which we have here termed Decision-Making 
Approach (DMA). This scale probably measures more
than just approach to decision making as we have 
defined it here. However, it is the best available to 
measure the quality in which we are interested and at 
least reflects it as indicated by the individual’s self 
perceptions. 

For the Ss used in the present investigation, the 
coefficient of correlation between scores on the two 
scales was .365. Therefore, it is apparent that while 
they measure certain common traits, the overlap is not 
great. In 20 of the 30 groups the same individual earned 
the highest scores on the two scales. 

In order to examine the problems outlined earlier, 
three scores were computed for each of the 30 groups on 
the two scales, supervisory ability and decision making. 
First of all, for each group the scores of all members 
were averaged. These mean scores give the general level 
of the group on the traits measured. Secondly, in each 
group the highest score earned by any of the members 
was noted. Finally, the score earned by the second 
highest person was noted and the difference between it 
and the highest score was taken as an index of the 
extent to which the highest scoring member in each 
group was uncontested. 

Finally, since we wished to examine the importance 
of a group having an uncontested person with the re- 
mainder of the group being homogeneous in the trait, a 
separate score was developed for the twenty groups 
containing three and four members. Two indices 
entered into this score, the first being the difference 
score just described, and the second being the range of 
scores of the two members of three-person groups and 
the three members of the four-person groups earning 
the lowest scores. For this purpose both difference and 
range scores were transmuted into standard scores and 
for each group the latter was subtracted from the
former. As an example of a group earning a high com- 
posite score, the highest scoring individual might have 
a raw score of 28 on the DMA scale; the next highest 
man in the group might have a raw score of 16. This 
difference score would then be 12, corresponding to a 
standard difference score of 65. The lowest three scores 
in the group might be 11, 13, and 16. The range of these 
three numbers is 5, corresponding to a standardized 
range score of 35. The composite score then is 65 minus 
35, or 30. As an example of a group with a low com- 
posite score, the highest raw score might be 20 and the 
next highest 19; the difference score here is 1, corre- 
sponding to a standard difference score of 30. The lowest 
three scores might be 12, 17, and 19; the range of these 
three scores is 7, corresponding to a standard range 
score of 45. The composite score is then 30 minus 45, or 
minus 15. It will be noted that the mean amount of 
the trait possessed by the group is the same in both 
cases, namely, 17. The first case represents substantial 
positive skewness, the second, substantial negative 
skewness. Since the number of cases in these groups is
so small, it was not felt justified to use precise measures
of skewness such as the third moment. Nevertheless,
for purposes of communication this composite score 
has been termed degree of positive skewness. 
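
The arithmetic of the two worked examples above can be checked with a short sketch. The function names are hypothetical. The conversion from raw to standard scores is not specified in the text, so the standard scores (65, 35, 30, 45) are passed in as quoted rather than derived.

```python
def group_trait_indices(raw_scores):
    """Mean, highest score, top-two difference, and range of the lower scorers."""
    ordered = sorted(raw_scores, reverse=True)
    lower = ordered[1:]  # every member except the highest scorer
    return {
        "mean": sum(ordered) / len(ordered),
        "highest": ordered[0],
        "difference": ordered[0] - ordered[1],
        "range_of_lower": max(lower) - min(lower),
    }


def composite_skewness(standard_difference, standard_range):
    """Composite = standardized difference minus standardized range."""
    return standard_difference - standard_range


# Raw DMA scores implied by the two worked examples in the text.
high_group = group_trait_indices([28, 16, 13, 11])
low_group = group_trait_indices([20, 19, 17, 12])

# Standard scores are taken as quoted in the text, not computed here.
high_composite = composite_skewness(65, 35)
low_composite = composite_skewness(30, 45)
```

Both groups have the same mean (17), yet the first has a large top-two difference with tightly bunched lower scorers, and the second the reverse, which is exactly the contrast the composite is meant to capture.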


RESULTS 


The coefficients of correlation between the 
various group scores and the two measures of 
performance, productivity and carefulness, 
are shown in Table 3. The coefficients of 
correlation that are reliably different from
zero are indicated by means of asterisks. 

For the supervisory scale, there are no 
significant relationships between either of the 
measures of group performance and mean 
scores of the entire group or the highest score 
in the group. These results mean that whether 


TABLE 3

COEFFICIENTS OF CORRELATION BETWEEN THE VARIOUS GROUP SCORES AND THE PRODUCTIVITY AND [CAREFULNESS CRITERIA]

                                                  Productivity   Carefulness
Supervisory Scale
  Mean score of the entire group                      .071       (illegible)
  Highest score in the group                          .151       (illegible)
  Difference between the two highest scores           .436**     (illegible)
  Positive skewness of group scores (a)               .454*      (illegible)

Approach to Decision Making Scale
  Mean score of the entire group                      .165       (illegible)
  Highest score in the group                          .265       (illegible)
  Difference between the two highest scores           .435**     (illegible)
  Positive skewness of group scores (a)               .815***    (illegible)

* p < .05.
** p < .02.
*** p < .001.
(a) For groups containing 3 and 4 members only.



a group does or does not possess a person who
stands high in the trait is not a significant factor
in its performance.

The results also indicate that the greater the
tendency for a group to have an individual who
is uncontested in supervisory ability, the
greater is the productivity of the group.
However, the relative homogeneity or heterogeneity
of the other members of the group
apparently is of little importance, since the
coefficients of correlation for difference scores
and for positive skewness scores, which combine
the difference score with a measure of
relative homogeneity, are very much the same.

While none of the supervisory scale scores is
related to carefulness of performance at
usually accepted levels of statistical significance,
it is interesting to note that the coefficients are all
negative. This finding suggests that supervisory
ability as gauged by the governing bodies
of organizations is oriented toward production
but not toward quality of performance.

The mean scores earned by the groups on 
the DMA scale are not significantly related to 
productivity but are to carefulness. By and 
large, those groups containing persons who 
tend to the dynamic type of decision making 
are less likely to be careful than those contain- 
ing persons who tend to decide only after 
careful examination of the situation. 

Whether or not the group contains a person 
who stands high in approach to decision 
making, that is, tends to be self-sufficient in 
decision making, is not significantly related 
to group performance. However, if the group 
possesses such a person and his position in the 
group on this trait is relatively uncontested, 
then the group is likely to be superior in productivity.
This relationship has statistical
significance.

When the group possesses an individual who 
is uncontested in this trait and the remaining 
members of the group are homogeneous with 
respect to it, then there is a very high degree 
of likelihood that the productivity of the group 
will be high. While the relationship with the 
criterion of carefulness is not statistically 
significant it is interesting to note that it, too, 
is positive. In fact, this is the only positive 
relationship between any group score and 
carefulness. 

Fig. 1 shows the relationships between the 
DMA positive skewness score, the numbers of 

[FIG. 1. RELATIONSHIPS BETWEEN SCALE SCORES, ORGANIZATIONAL VARIABLES, AND PRODUCTIVITY. Diagram linking the scale score (Positive Skewness, DMA Scale), the organizational variables (No. of Organization Changes, No. of Methods Changes), and the Productivity Criterion Score. Legible coefficients: .82, -.47, -.60, -.64, -.55.]



organizational and methods changes, and the 
productivity criterion. The numbers of changes 
are negatively related both to the criterion 
and to the DMA positive skewness score. This 
result suggests that a large number of these 
changes leads to poor productivity, and that 
the groups with high positive skewness on the 
DMA scale tend to change organization and 
methods less frequently than negatively skewed
groups. All correlations shown in Fig. 1 are 
significantly different from zero beyond the 
.01 level of confidence. 


DISCUSSION 


There seem to be two dimensions in the task 
of operating the model railroad used in this 
experiment, namely, productivity and care- 
fulness. Of the two, the predictor measures 
used were related much more closely to pro- 
ductivity than to carefulness. Furthermore, it 
appears that neither group means nor the 
highest score in the group on the scales used are 
very effective predictors of productivity. 
Measures of the skewness of groups’ scores 
seem to be highly related to productivity, 
more so for the decision-making approach 
(DMA) scale than for the supervisory scale. 
In brief, it seems that the characteristics of 
the distribution of these managerial traits in 
the group are more important in determining 
its performance than any one measure of 
central tendency or extremity. Possible foun- 
dations of these findings need to be examined. 

It seems to the authors that since the DMA 
skewness score describes a particular char- 
acteristic of the distribution of traits in the 
group considered as a whole, one place to look 
for a basis for its relationship to productivity 
might be in the organizational behavior re- 
vealed by the group at work. Fig. 1 suggests a 
promising approach to this problem. By and 
large, groups high on the DMA scale (positive 
skewness) went through far fewer organiza- 
tional and methods changes than those which 


were low (negative skewness), and the number 
of these changes in turn is related to produc- 
tivity. 

It is not difficult to imagine why the number 
of organization and methods changes affects 
productivity. Each time a change occurs, some 
or all members of the group must learn a new 
set of responses to a new set of relevant 
stimuli. They must learn what new things to 
look for and what to do about them. Where 
there is a division of labor in the group, and 
this was necessary in order to perform the 
task, each change of organization or method 
involves such relearning, and the more changes 
there are, of course, the less opportunity the 
group has to learn any one method or set of 
tasks thoroughly. As to why the DMA positive 
skewness scale is related to the number of 
changes, only speculation is possible at this 
time. It may be that in groups with high 
positive skewness, the flow of suggestions com- 
ing from the “idea man” can be effectively 
put into practice or rejected by the two or 
three others who are “cautious evaluators.” 
In groups low or negatively skewed, on the 
other hand, there may be a great many sugges- 
tions which are ineffectively evaluated, thus 
leading to many changes in organization. An 
analysis of the verbal interaction of the group 
might go far in answering some of these 
questions. 

It must be emphasized that the sampling 
generality of these results is limited. The 
population sampled was undergraduate college 
males. They had never seen each other before 
coming into the experimental room, nor had 
they ever performed this particular task. A 
separate analysis by the authors¹ showed that
a great deal of learning took place during the 
two hours spent on the task. Examination of 
the number of organization and methods 
changes showed that in the first trials the num- 
ber of changes was low, increased steeply up 


1 Unpublished manuscript. 
to the sixth three-minute trial, then slowly
decreased. Similar results may not be found 
with other work groups where the organiza- 
tion has attained stability and methods do 
not change a great deal. Many formal organiza- 
tions have a fairly slow turnover of people, 
so that at any given time, most of the mem- 
bers will know each other fairly well. Also, 
responsibility for developing new methods of 
performing a task usually is not given to the 
same people who perform the task. Finally, 
the task used in this study required close 
cooperation among all members in order for 
them to produce anything, whereas in many 
organizations which perform tasks, the group 
productivity may be merely the sum of the 
production of individuals. 


SUMMARY 


The relationship of the distribution of 
managerial traits within a group to produc- 
tivity in a complex cooperative task was in- 
vestigated, using groups of two, three, and 
four people. Two managerial traits, supervisory 
ability and decision-making approach, were 


measured by means of self-descriptions ob- 
tained from a forced-choice adjective check- 
list. Various combinations of these scores were 
formed for each group, and these were corre- 
lated with criteria of productivity and careful- 
ness, the criteria being weighted composites 


derived through cluster analysis of various 
performance measures. It was found that nei- 
ther the average amount of the trait possessed 
by the group nor the amount possessed by the 
highest scorer was highly related to the 
criteria. Two measures of skewness used, 
namely, the difference between the highest and 
next highest scorer, and this difference minus 
the range of the lower scorers, were highly 
related to the criterion of productivity, a 
correlation of .82 having been obtained be- 
tween the criterion and the positive skewness 
of the group on the decision-making approach 
(DMA) scale. It is suggested that part of the 
relationship may be based on the fact that 
DMA positive skewness is negatively related 
to the number of organizational and methods 
changes occurring in the group, and the num- 
ber of such changes is in turn negatively re- 
lated to the productivity criterion. 


THEMATIC APPERCEPTION TEST: AN EMPIRICAL EXAMINATION
OF SOME INDICES OF HOMOSEXUALITY¹

Most attempts to introduce systematic
knowledge and precision of statement
into the use of projective techniques
are frustrated by an inability to identify a
firm basis from which to initiate investigation.
Each conceivable empirical problem immedi- 
ately calls to mind a host of other problems 
that seem at least as significant and of prior 
importance. Even attempts to explore very 
elementary questions characteristically raise 
the shadows of dozens of other issues, the 
resolution of which seems necessary before the 
first question can be approached meaningfully. 
Under such circumstances, one defensible 
tactic is simply to begin with a careful attempt 
to state explicitly the present status of projec- 
tive testing or a particular projective tech- 
nique, and then to examine piecemeal the accu- 
racy or utility of the various elements in this 
position. This paper represents one part of 
such an approach to the study of the Thematic 
Apperception Test. We began with a statement 
of assumptions that underlie the conventional 
use of the test (5) and followed this with the 
identification of well over 500 generalizations 
that related specific aspects of Thematic 
Apperception Test response to some char- 
acteristic or attribute of the story teller (8). 
These statements were collated from the 
published literature and represent the efforts 
of clinicians and investigators to summarize 
and generalize their experience with the test. 

Dagny Kalnins and Philip Bossart in conducting these

Given this list, we then set out to examine the
empirical utility of significant clusters of the 
generalizations (6, 7). The present paper re- 
ports the results of several studies that ex- 
amined the potential utility of existing indices 
of homosexuality. 

We found initially a total of nine different 
generalizations concerning the relation between 
homosexuality and aspects of TAT response. 
Having identified these, we proceeded to 
develop a scoring procedure for each of them 
that would permit us to analyze TAT protocols 
objectively. We then related the resultant 
scores for a group of intensively studied sub- 
jects to various independent measures of 
sexuality. Further, we applied our derived 
scoring procedure together with some addi- 
tional variables to the TAT protocols of a 
group of known, overt homosexuals as well as 
to the protocols of a normal control group. 


RELATION BETWEEN TAT INDICES OF
HOMOSEXUALITY AND RATINGS OF
SEXUALITY

Our initial inquiry into the utility of the 
indices of homosexuality examined the associa- 
tion between these variables and several sets of 
ratings that were designed to assess the sexual 
motives of a relatively normal group of sub- 
jects. 


Procedure 


Subjects. The Ss were 20 undergraduates of 
Harvard College who had volunteered from a 
large introductory psychology course to parti- 
cipate in an intensive program of personality 
study and who were paid at the customary 
rates for student employment. They were 
selected so as to be representative of their 
college class in terms of such variables as 
religion, socioeconomic status, academic stand- 
ing, and extracurricular activities. 

Ratings. The Ss were studied over a two-year
period by a team of roughly a dozen psycholo- 
gists, using a wide variety of techniques that 
included virtually all of the conventional 
devices for assessing personality in addition to 
a number of newly devised techniques. Among 
these instruments were projective techniques, 
inventories, observational methods, auto- 
biographical report, interviews, and various 
special ratings. At the end of the period of 
study, each S was discussed at length in a 
diagnostic council involving the various 
specialists participating in the study. Upon 
conclusion of this session, a general formula- 
tion of the individual’s personality was agreed 
upon, and a number of ratings on specified 
variables were assigned. One of these ratings 
focused upon the variable of homosexuality. 
The judgment was made in terms of a six-point 
scale with the intention of reflecting the im- 
portance of this motive as a determinant of the 
S’s behavior. Inasmuch as a great deal of the 
information that the diagnostic council had at 
its disposal was presumed to bear upon covert 
or latent aspects of personality, it was natural 
that the ratings emphasized latent homo- 
sexuality. It was generally understood for all 
of the variables rated that if an individual 
showed strong and unmistakable evidence of 
the unconscious or covert operation of the mo- 
tive, he would receive a high rating on the 
variable even though this motive did not re- 
ceive direct expression in overt behavior. 
These ratings are referred to as diagnostic 
council ratings. 

The fact that the diagnostic council judges 
were familiar with the TAT responses of the 
Ss represents a possible source of contamination.
The same factors that lead to high scores on the 
TAT indices of homosexuality might be 
attended to by the judges and thus lead them 
to equally high diagnostic council ratings on 
this variable. Actually, the judges did not have 
the results of any of these ratings nor were 
they specifically acquainted with the generali- 
zations. Further, the amount of material that 
they had to consider and digest in arriving at 
their decisions made it unlikely that there 
would be any serious bias introduced by this 
factor. Finally, in two other studies (6, 7) it 
was demonstrated quite conclusively that the 
specific TAT indices were not mirrored in or 
important influences on the ratings of the 
diagnostic council. 





In addition to the ratings of homosexuality, 
we also had available for 15 of the Ss a set of 
self ratings concerned with heterosexual activ- 
ity and needs. These ratings, also made on a 
six-point scale, reflected the S’s judgment of 
how strong was his need for heterosexual 
expression in comparison to other individuals 
of his age and general background. Finally, 
for 18 of the Ss, we had a set of observer ratings 
of the same variable, made by an experienced 
clinical psychologist on the basis of a detailed 
autobiography and a fact-finding interview. 

Thematic Apperception Test. The TAT was 
administered individually under standard con- 
ditions with the examiner taking notes and at 
the same time obtaining an electrical transcrip- 
tion of the stories. The two sources of informa- 
tion provided a virtually verbatim record of 
the S’s speech. 

The stories were analyzed in terms of nine 
variables that had been derived from the col- 
lected list of empirical generalizations. In each 
case, we began with the published statement of 
the variable and then attempted to objectify 
the statement through the use of specific 
rules that would permit a quantitative out- 
come. In describing these nine variables we 
shall present a brief quotation that identifies 
the source of the variable, as well as its general 
content, and indicate the kind of rating scale 
that was used in analyzing the stories. 


Discomfort on Card 9BM. “Card 9 is always signifi- 
cant because the picture of the men lying on the ground 
will make a male patient uncomfortable if he feels any 
attraction for other men” (1, p. 106). The scoring was 
dichotomous, with discomfort inferred from such indices 
as incidental remarks by the S indicating disturbance 
or an inability to think of a story, hesitation, and long 
pauses before beginning the story, extreme shortness of 
the story in relation to the typical length of the S’s 
stories. 

Hypnotism on Card 12BM. “Stories to #12 in which 
the young man on the couch has given himself up to be 
hypnotized by the older man or in which the older man 
has forcibly hypnotized the young man ... frequently
reflect the patient’s latent homosexual tendencies or
overt homosexual experience” (11, p. 46). Dichotomous
scoring. 

Misrecognition of Sex. “Latent homosexuality or a 
condition where a patient is in conflict concerning 
homosexual urges may express itself in distortions of the
sex of the character in the pictures” (10, p. 90). Dichotomous
scoring.

Attack From the Rear on Card 18BM. “Stories to #18
in which the hero has been attacked from the rear or 
pulled to the rear, frequently reflect the patient’s 
latent homosexual tendencies or overt homosexual ex- 
perience” (11, p. 46). Dichotomous scoring. 

Feminine Identification. “Latent homosexuality or a
condition where the patient is in conflict concerning 
homosexual urges may express itself in identification 
with female characters” (10, p. 90). Dichotomous 
scoring. 

Attitude Toward Marriage. “Latent homosexuality or 
a condition where a patient is in conflict concerning 
homosexual urges ... may also be discovered by basic 
attitudes toward marriage and the opposite sex which 
will occur with some frequency” (10, p. 90). Rated on a 
five-point scale. 

Man Killing Woman. “Murray stated that men who
produce stories in which men kill women are likely to 
have conflict over strong homosexual tendencies” (9, 
p. 114). Scored as present or absent with a borderline 
category for cases where a male has indirect responsi- 
bility for killing a woman. 

Sexual References. “In manifest homosexuals I have 
noticed a tendency toward frequent manifest sexual 
references” (2, p. 216). Rated on a four-point scale. 

Unstable Identification. “In manifest homosexuals I 
have noticed a tendency toward a recurrent shift of 
identification not only from males to females but also 
within one sex” (2, p. 216). Rated on a three-point 
scale. 


For those variables that showed sufficient 
range, product-moment correlations were com- 
puted to estimate the degree of association 
between the experimental ratings and a set of
independently performed reliability ratings.
Three of these variables showed relatively high 
reliability (Attitude Toward Marriage, .83; 
Man Killing Woman, .89; Sexual References,
.88) while two showed only moderate relia- 
bility (Feminine Identification, .68; Unstable 
Identification, .63). The remaining four vari- 
ables did not show sufficient range of scores to 
permit the usual correlation approach, but the 
use of a chi-square measure of association 
showed a highly significant covariance between 
the two sets of ratings. 

Analysis of data. In relating the TAT indices 
to the various ratings, the product-moment 
correlation was employed wherever the distri- 
bution of the data permitted; in all other in- 
stances, the biserial correlation was used. In 
each case, the direction of the relationship was 
predicted in advance, and for this reason a 
one-tailed test of significance was used con- 
sistently. 
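The one-tailed test of a product-moment correlation described above can be sketched as follows. This is a minimal illustration, not the authors' computation: the scores are invented, the function names are hypothetical, and the critical value assumes the 20 subjects mentioned in the summary (18 degrees of freedom).

```python
import math

def pearson_r(x, y):
    """Product-moment correlation between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def t_from_r(r, n):
    """t statistic for testing r against zero, with n - 2 degrees of freedom."""
    return r * math.sqrt((n - 2) / (1.0 - r * r))

def critical_r(t_crit, n):
    """Smallest r that reaches a given critical t for n paired observations."""
    df = n - 2
    return t_crit / math.sqrt(t_crit ** 2 + df)

# Tabled one-tailed .05 critical value of t for 18 degrees of freedom (n = 20).
T_CRIT_ONE_TAILED_05_DF18 = 1.734

# Hypothetical scores, invented for illustration only.
tat_index = [1, 2, 2, 3, 3, 4, 4, 5]
rating = [2, 1, 3, 3, 4, 3, 5, 5]
r = pearson_r(tat_index, rating)

# With n = 20, a predicted-direction r must exceed this value to be significant.
r_needed = critical_r(T_CRIT_ONE_TAILED_05_DF18, 20)
```

Because the direction is predicted in advance, only a coefficient with the predicted sign is compared against the critical value; a coefficient of the opposite sign is not counted as confirming the prediction, whatever its size.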


Results 


It was our general expectation that the TAT 
signs would show the highest degree of associ- 
ation with the diagnostic council ratings of 
latent homosexuality and the least association 
with the self ratings of heterosexuality. This
prediction was based on the assumption that
the outcome of the complex diagnostic council
procedure was a more accurate measure of the
S’s latent motivation than the judgments of
the single observer, which in turn should be
superior to the self ratings. We expected, of
course, that the ratings of latent homosexuality
would be positively related to our TAT indices,
while the ratings of heterosexuality would be
negatively related to these indices. It is appar-
ent that the above reasoning is grossly over-
simplified, as there are many complicating
factors that would be expected to operate in this
situation which we have not considered.
Nevertheless, in advance of the study this
seemed the most defensible general prediction
that could be made.


TABLE 1

RELATION BETWEEN RATINGS AND TAT INDICES OF HOMOSEXUALITY

[Table body not recoverable from the source. Rows: the nine TAT indices listed below. Columns: diagnostic council, observer, and self ratings.]

1. Discomfort on Card 9BM
2. Hypnotism on Card 12BM
3. Misrecognition of Sex
4. Attack From the Rear on Card 18BM
5. Feminine Identification
6. Attitude Toward Marriage
7. Man Killing Woman
8. Sexual References
9. Unstable Identification

* Significant at the .05 level.

The results of our analysis are summarized 
in Table 1, where we find a weak confirmation 
of our expectation that the TAT indices would 
show evidence of positive association with the 
diagnostic council ratings of homosexuality. 
Of the nine correlations, six are positive as 
predicted (although two of these are of negli- 
gible magnitude), and two attain significance 
at the .05 level. None of the three negative
correlations attains conventional significance.
The observer ratings clearly provide an even 
less impressive correlational picture. Only 
two of the nine correlations are in the predicted 
direction, and the single coefficient that attains 
significance is not one of these. Therefore, on 
grounds of number of significant correlations,
as well as on grounds of the general tendency
of the data to fall in predicted directions, the 
observer ratings seem less intimately related 
to the TAT indices than are the diagnostic 
council ratings, as we had predicted. 

There are no significant correlations between 
the self ratings and the various TAT indices, 
but there is more consistency in direction here 
than in either of the other cases. Seven of the 
nine correlations are in the predicted direction, 
and four of these are above .30 as compared to 
only two of the correctly predicted relations 
involving the council ratings. Thus, our hy- 
pothesis that the self ratings would show the 
least evidence of association with the TAT 
indices does not appear to be confirmed. 

Although these relations are sufficiently 
slight so that it is difficult to assert anything 
with conviction, it seems clear that the vari- 
ables concerned with Feminine Identification 
and Sexual References merit further investiga- 
tion. Worth noting is the fact that three of the 
indices of homosexuality most frequently en- 
countered among users of the TAT (Discom- 
fort on Card 9BM, Hypnotism on Card 12BM, 
Attack From the Rear on Card 18BM) actu- 
ally show a negative association with the 
council ratings. There is no evidence here to 
support custom! This observation, when
coupled with the finding that the same three 
variables were related to the self ratings in the 
predicted direction, is reminiscent of the re- 
sults of a similar study (7) where it was dis- 
covered that TAT indices of aggression were 
most intimately related to self ratings and least 
closely associated with diagnostic council 
ratings. 

In this earlier study it was found that a 
general clinical rating, based on the TAT pro- 
tocols with no other knowledge of the S and 
with no attempt at objectivity and careful 
quantification, led to a correlation with the 
council ratings that was as high as the best of 
the TAT indices. Virtually the same finding 
was unearthed in the present study. When 
general, clinical ratings of homosexuality, 
made by the individual who administered the 
test and with no further information con- 
cerning the S, were related to the council 
ratings, they showed a product-moment corre- 
lation of .47. This is approximately the same 
as the highest of the correlations obtained with 
the carefully scored TAT indices. Thus, the 
painstaking efforts at quantification seem to
have produced an end result which, on the
average, is inferior to the easy, general rating. 

Before considering any further implications 
of these findings, let us turn to a second study 
that employed many of these same TAT in- 
dices and in this case scrutinized them against 
a somewhat more sharply differentiated crite- 
rion than that used in the present study. 


DIFFERENCES BETWEEN OVERT HOMOSEXUALS
AND NORMAL SUBJECTS ON TAT
INDICES OF HOMOSEXUALITY


Having examined the relation between our
TAT indices and various ratings of sexuality 
for a normal group of Ss, the question next 
arose as to how well these indices would 
function when examined in connection with a 
broader range of sexual behavior. In order to 
answer this question, we obtained a set of TAT 
protocols from a group of homosexual Ss and 
a control set from a group of normal Ss and 
compared the performance of the two groups 
on all of the variables used in the previous 
study that could be applied to the present 
protocols. We also added to our comparison a 
number of variables that we had encountered 
in the literature subsequent to the first study, 
and we devised certain new variables in the 
expectation that they might differentiate the 
two groups. Finally, we classified the two 
groups of Ss on the basis of a clinical appraisal 
of the TAT stories, making no effort to provide 
ratings or specific justification for these deci- 
sions. 


Procedure 


Subjects. The Ss consisted of 20 under- 
graduate males who had admitted to overt 
homosexual acts and 20 undergraduate males 
who to the best of our knowledge had never 
committed a homosexual act. None of the Ss 
displayed any gross evidence of psychopathol- 
ogy, and most of the homosexual group had 
accepted their sexual deviation and were not 
receiving psychotherapy. The two groups 
were matched in sex, age, and educational level. 
The reader should note that a small number of 
our homosexual Ss were also included in a 
study reported by Davids, McArthur, and 
Joelson (3); consequently, the findings of
these two studies cannot be considered cumulative
or independent estimates of the
actual relationships existing.

Thematic Apperception Test. A shortened
form of the TAT, consisting of Cards 4, 6BM, 
7BM, 10, and 18BM, was administered indi- 
vidually by a male administrator. The proto- 
cols were typed uniformly so that the rater 
was unaware of the identity of any of the Ss 
when analyzing the stories. The protocols 
were then scored for the following variables 
taken from the list used in the previous study: 
Misrecognition of Sex, Attack From the Rear 
on Card 18BM, Feminine Identification, Atti- 
tude Toward Marriage, Man Killing Woman, 
Sexual References, and Unstable Identifica- 
tion. In a few cases, the scoring system was 
slightly modified, but in general these stories 
were scored in very much the same manner as 
the first set. The first two variables of the 
previous study (Discomfort on Card 9BM and 
Hypnotism on Card 12BM) could not be used 
as they are linked to specific TAT cards not 
employed in this study. However, a number 
of additional variables were examined, and 
these are briefly described below together with 
an indication of the source of the variable and 
the general form of the rating. 


Representing Feminine Feelings and Emotions. “... 
evidence of his having experienced many things in a 
‘feminine’ way himself ... ability to create a picture of
(a woman’s) yearnings and feelings (card 10)...” (4, 
p. 215). Rated on a five-point scale. 

Shallow Heterosexual Relations. “... patients who 
are under the pressure of strong homosexual needs, 
usually paranoids, and neurotic characters who cannot 
conceive of a genuine tender relationship between any 
two persons... think that the scene (Picture 10) is a 
sham, that there is really no reciprocal affection” (4, 
p. 216). Scored on a five-point scale ranging from posi- 
tive and satisfying relation to relation lacking in love or 
affection. 

Male Embrace on Card 10. “... a strong clue to latent
homosexuality or even manifest problems of this
nature. ... If this (Picture #10) is interpreted as an
embrace between males by a male subject” (2, p. 210). 
Scored dichotomously. 

Attitude Toward Opposite Sex. In the first study, 
attitudes toward marriage and the opposite sex were 
scored as a single variable, whereas in the present study 
these were separated and scored individually on a five- 
point scale ranging from very positive to very negative. 

Tragic Heterosexual Relations. “The homosexuals 
gave several stories in which a heterosexual relationship 
was followed by tragedy, usually the violent and sudden 
death of one member of the couple” (3, p. 168). Scored
dichotomously. 

Attachment to Father. “The first sign (of homo- 
sexuality) ... applies to stories containing a strong un- 
resolved attachment to a father or father figure composed 
for card 8” (3, p. 168). There was a typographical error
in the original article, and this sign should be referred to
Card 7BM. In the present study, this variable was
scored dichotomously for Card 7BM and Card 18BM.

Derogatory Sexual Terms Applied to Women. “Although
the homosexuals did not often produce stories in
which a clearly derogatory attitude toward women was
revealed, use of certain sexual derogatory terms in relation
to women gave specific clues to their feelings toward 
female characters in their stories” (3, p. 168). Scored 
dichotomously. 

Attachment to Mother. “Another sign (of homo- 
sexuality), associated with Card 6, applied to stories 
involving a strong unresolved attachment to the mother” 
(3, p. 168). This variable was scored dichotomously for 
both Cards 6BM and 10BM. 

Symbolism or Allegory in Response to Card 18BM. 
“...in response to Card 18, several homosexuals told 
stories that describe the plight of the young man alle- 
gorically or symbolically...” (3, p. 168). Scored di- 
chotomously. 


In addition to these variables, certain further 
indices were formulated in the expectation that 
they might reveal differences between the two
groups.


Incest. This variable was scored as present or absent, 
with presence being assigned to any kind of sexual re- 
lation among members of the same family. 

Manifest Homosexuality. Scored dichotomously with 
homosexual act, thought, impulse, or even use of the 
word being scored as presence of the variable. 

Perception of Card 10 as Other Than Elderly Couple. 
Scored dichotomously. 

Introduction of Female in Positive Context on Card 
18BM. In this case, it was expected that the normal
group would show the higher incidence of the sign which 
was scored as present or absent. 


Reliability. In order to assess the extent to 
which these indices could be scored reliably, a 
second rater with no shared practice with 
the experimental rater performed the same 
ratings on 40 stories selected so that there was 
one story from each of the Ss and an equal 
number of stories told to each of the pictures 
employed in this study. The relation between 
the scores of the two raters is summarized in 
Table 2. Where the data permitted, product- 
moment correlations were computed, and 
where the range of scores was insufficient to 
permit this, tetrachoric correlations were 
computed. In those instances where there were 
too few entries in one or more of the cells to 
permit a reasonable use of the tetrachoric 
correlation technique, we have simply reported 
the percent of complete agreement between the 
two raters. In general, the results are highly 
satisfactory. The lowest Pearson r is .71, the 
lowest tetrachoric correlation coefficient is 
.81, and the lowest percentage of agreement is
83. Considering the fact that this degree of
agreement was achieved without any shared
experience on the part of the two raters, the
results compare very favorably with those
reported by other investigators.


TABLE 2

INTERRATER RELIABILITY OF TAT INDICES
(N = 40)

[Reliability coefficients not recoverable from the source. Variables listed:]

Misrecognition of Sex
18BM: Attack From the Rear
Feminine Identification
Attitude Toward Marriage
Man Killing Woman
Sexual References
Unstable Identification
Feminine Feelings, Emotions
Shallow Heterosexual Relations
Male Embrace
Attitude Toward Opposite Sex
Tragic Heterosexual Relations
Attachment to Mother
18BM: Symbolism or Allegory
Attachment to Father
Derogatory Sexual Terms Applied to Women
Homosexual Content
Incest
10BM: No Elderly Couple
18BM: Positive Introduction of Female
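Two of the reliability measures named in this section can be illustrated with a short sketch. The percent-agreement function follows directly from the description; the tetrachoric function uses the cosine-pi approximation, which is an assumption made here for illustration, since the source does not say how its tetrachoric coefficients were computed. The function names and the ratings are hypothetical.

```python
import math

def percent_agreement(rater_a, rater_b):
    """Percentage of stories on which the two raters gave the same score."""
    matches = sum(1 for a, b in zip(rater_a, rater_b) if a == b)
    return 100.0 * matches / len(rater_a)

def tetrachoric_cos_pi(a, b, c, d):
    """Cosine-pi approximation to the tetrachoric correlation.

    a and d are the agreement cells of the 2 x 2 table (both raters score
    present, both score absent); b and c are the disagreement cells.
    """
    if min(a, b, c, d) == 0:
        # With an empty cell the estimate is not usable; this mirrors the
        # fallback to percent agreement described in the text.
        raise ValueError("empty cell: report percent agreement instead")
    return math.cos(math.pi / (1.0 + math.sqrt((a * d) / (b * c))))

# Hypothetical dichotomous ratings, invented for illustration only.
first_rater = [1, 0, 1, 1]
second_rater = [1, 0, 0, 1]
agreement = percent_agreement(first_rater, second_rater)
```

The approximation returns zero when agreements and disagreements balance and approaches one as the disagreement cells shrink relative to the agreement cells.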

Analysis of data. In the analysis of these 
data, the differences between the two groups 
were tested through the use of conventional 
t tests where the data permitted. In those
cases where the data departed so far from 
normality as to prohibit this, we simply 
counted the occurrence of the index among 
the Ss and tested the significance of the differ- 
ence between the resulting proportions of oc- 
currence of the index in the two groups. In a 
few cases, the theoretical frequency in one or 
more of the cells was less than five, and in these 
cases Fisher’s Exact Test was employed. A 
one-tailed test of significance was used con- 
sistently because we predicted the direction 
of the relationship in every case. 
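For the small-frequency case, a one-tailed Fisher's Exact Test on a 2 x 2 table can be sketched as below. This is an illustrative sketch under stated assumptions: the function name and the cell counts are hypothetical, and the table layout (groups as rows, presence or absence of the index as columns) is chosen here for convenience.

```python
from math import comb

def fisher_exact_one_tailed(a, b, c, d):
    """One-tailed Fisher's Exact Test for the table [[a, b], [c, d]].

    Rows are the two groups; columns are index present and index absent.
    Returns the probability, with all margins fixed, of a count in the
    top-left cell at least as large as the one observed.
    """
    row1 = a + b
    row2 = c + d
    col1 = a + c
    total = comb(row1 + row2, col1)
    largest_a = min(row1, col1)
    tail = sum(comb(row1, k) * comb(row2, col1 - k)
               for k in range(a, largest_a + 1))
    return tail / total

# Hypothetical counts, invented for illustration only: the index appears in
# 5 of 20 protocols in one group and in none of 20 in the other.
p = fisher_exact_one_tailed(5, 15, 0, 20)
```

The one-tailed form sums only the tables that depart from independence in the predicted direction, which matches the practice of predicting the direction of each difference in advance.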


Results 


The differences between the homosexual and
normal groups for the 20 variables are sum- 
marized in Table 3. It is clear that the data 
generally tend to bear out our prior expecta- 
tion as only 3 of the 20 differences reverse the 
predicted direction. Moreover, a considerable 
number (nine) of the differences that conform
to our prediction attain significance at the .05
level or better, while none of
the relations reversing our prediction attained 
conventional significance. 

The nine variables that successfully differ- 
entiated between the criterion groups were 
Shallow Heterosexual Relations, Feminine 
Identification, Unstable Identification, Atti- 
tude Toward Marriage, Man Killing Woman, 
Use of Symbolism or Allegory on Card 18BM, 
Derogatory Sexual Terms Applied to Women, 
Homosexual Content, and Introduction of a 
Woman in a Positive Context on Card 18BM. 
In general, these variables yield a consistent 
picture of negative or hostile sentiments to- 
ward members of the opposite sex and psycho- 
logical femininity. 

It is immediately clear that our variables 
are more intimately related to the distinction 
between overt homosexuals and normal Ss 
than they are to the various independent rat- 
ings of sexuality. Such a finding could reflect 
either the infirmity of our ratings, the greater 
range of sexual behavior involved in the sec- 
ond study, or the tendency of these signs to 
discriminate most effectively overt manifesta- 
tions of the motive. In a study of indices of 
aggression (7), there were quite clear-cut 
findings indicating that the TAT indices were 
more closely related to conscious than to 
latent or unconscious aspects. That the case 
is not quite so simple in the present inquiry is 
indicated by the failure in the rating study to 
find any clear evidence of a closer relation 
between the TAT indices and the self ratings 
than between the TAT indices and the diag- 
nostic council ratings. 

When the two variables that were signifi- 
cantly related to the diagnostic council ratings 
in the first study are examined in the present 
context, we find that both show some evidence 
of differentiating the groups in the predicted 
direction. This trend is very slight and non- 
significant in the case of Sexual References but 
is highly significant for the variable of Femi-
nine Identification.

What of the remaining variables that were 
employed in both studies? The evidence is 
completely consistent in regard to the lack of 
utility of Misrecognition of Sex. The studies 
agree in suggesting the relative promise of 
Unstable Identification and Man Killing 
Woman. The findings are relatively in-
determinate in both cases for the variable of
Attack From the Rear, while the variable of
Attitude Toward Marriage functioned well
in the present study and very poorly in the
rating study. In summary, the two studies
produced relatively congruent findings for
six of the seven overlapping variables.


TABLE 3

DIMENSIONAL TAT COMPARISON OF NORMAL AND HOMOSEXUAL SUBJECTS

[Table body not recoverable from the source. Columns: means and frequencies for the normal and homosexual Ss, with test statistics (t or χ²) and probability estimates. Variables listed:]

Misrecognition of Sex
18BM: Attack From the Rear
Feminine Identification
Attitude Toward Marriage
Man Killing Woman
Sexual References
Unstable Identification
Feminine Feelings, Emotions*
Shallow Heterosexual Relations
Male Embrace
Attitude Toward Opposite Sex
Tragic Heterosexual Relations
Attachment to Mother
18BM: Symbolism or Allegory
Attachment to Father*
Derogatory Sexual Terms Applied to Women
Homosexual Content
Incest
10BM: No Elderly Couple*
18BM: Positive Introduction of Female

* These are the only variables where the difference was not in the predicted direction and consequently the only variables where a two-tailed test of significance was employed.

** Where there is no value for either t or χ², the probability estimate was derived from Fisher's Exact Test.

These indices of homosexuality have func- 
tioned relatively well against the criterion 
measure. Certainly this is the case when they 
are compared to the effectiveness of indices of 
anxiety and aggression. One may still question 
seriously, however, whether any or all represent 
much of an aid for the clinician or investigator. 
It is evident that the variables that function 
best are precisely those which one would not 
expect to encounter when dealing with strongly 
repressed homosexuality or with an overt 
homosexual who was attempting to conceal 
this aspect of his behavior. Indices having to 
do with overt homosexuality, crude sexual 
references, derogatory remarks concerning 
the other sex, etc., are all readily susceptible 
to conscious censoring. While it remains an 
interesting research question as to what the 
differences are between TAT responses col- 
lected under circumstances where the examiner 
is known to be familiar with the homosexuality 
of the S and under circumstances where the 
examiner is not aware of this, there seems little
doubt that many of the present characteris-
tics of the homosexual protocols would be
minimized or eliminated where the deviance 
of the Ss was less openly maintained. 

A different approach to the clinical utility 
of these indices can be made through com- 
paring the results of our objective analysis 
with the results of a general clinical sorting of 
the 40 protocols. This categorizing was made 
by one of the authors without awareness of the 
identity of any of the Ss but with a general 
familiarity with all the specific variables used 
in the study. He was able to sort the protocols 
with 95 per cent accuracy. In other words, he 
made only one reversal, classifying a single 
homosexual protocol as normal and a single 
normal protocol as homosexual. The accuracy 
was actually even greater than this over-all 
figure implies, because in making these ratings, 
he initially sorted the stories according to his 
certainty or confidence of their classification 
(16 in the confident homosexual group and 13 
in the confident normal group), and all of these 
were identified correctly. Further, the stories 
were ranked in terms of the certainty of the 
judgment so that a rank of 20 would represent 
the set of stories that was least confidently 
classified. The two stories misclassified were 
respectively 19th in the homosexual group and
16th in the normal group. In other words, the
stories were correctly classified in 95 per cent 
of the cases, and the two errors were made on 
judgments held in relatively low confidence by 
the judge. 
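The arithmetic behind the 95 per cent figure, together with a rough sense of how unlikely such a sorting would be by chance, can be sketched as follows. The chance calculation rests on an assumption not stated in the source: that the judge placed exactly 20 protocols in each group, so that random sorting follows a hypergeometric distribution. The function names are hypothetical.

```python
from math import comb

def sorting_accuracy(n_correct, n_total):
    """Proportion of protocols placed in the correct group."""
    return n_correct / n_total

def chance_probability(n_per_group, n_hits):
    """Probability that random sorting places at least n_hits of one group's
    protocols correctly, assuming exactly n_per_group protocols are assigned
    to each of the two groups."""
    total = comb(2 * n_per_group, n_per_group)
    tail = sum(comb(n_per_group, k) * comb(n_per_group, n_per_group - k)
               for k in range(n_hits, n_per_group + 1))
    return tail / total

# One reversal means 19 of 20 correct in each group, or 38 of 40 overall.
accuracy = sorting_accuracy(38, 40)
p_chance = chance_probability(20, 19)
```

Under this assumption, a single reversal or better would arise by chance in only 401 of the possible ways of choosing 20 protocols from 40, a probability well below one in a million.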

Clearly, the uninhibited clinician functions 
better than any one of the objective indices, 
and furthermore he functions better than any 
reasonable combination of these indices. This 
observation is not meant to disparage the in- 
dices but simply to point out again the relative 
inefficiency of customary indices or signs as 
diagnostic tools. 

DISCUSSION 

The results of our two inquiries agree in 
suggesting that those aspects of the TAT that 
have been identified conventionally as resonant 
to homosexuality are approximately as adver- 
tised. That is, there is evidence of association 
in the predicted direction between our TAT 
indices and the criterion measures. The strength 
or magnitude of the observed relations is 
generally not very high, but there can be little 
question that these indices have functioned 
better than equivalent sets of indices for the 
variables of anxiety (6), aggression (7), and 
schizophrenia examined in past studies. 

Our tentative findings suggest that the TAT 
protocols of the individual characterized by 
strong homosexual tendencies display con- 
sistently negative attitudes toward women, a 
lack of full, rich, and satisfying relations with 
members of the opposite sex, and occasional 
reference to manifest homosexuality. In addi- 
tion, there were a few formal differences 
between the stories of the normal and the 
homosexual, such as the tendency on the part 
of homosexual Ss to utilize symbolism or 
allegory on Card 18BM. As we have already 
indicated, there is no certainty that these 
signs will continue to function effectively 
in settings that differ from those employed in 
the present study. Not only does this caution 
apply to differences involving S awareness of 
the examiner’s knowledge of his homosexuality 
but, also, to variation in such characteristics 
as intelligence, educational level, and group 
membership. Thus, the use of allegory or 
symbolism seems almost certain to be a 
function of homosexuality only under circum- 
stances where a certain minimum of intelli- 
gence and education has been attained. 

What of the distinction between latent and
overt homosexual tendencies? Here our evi-
dence is rather inconclusive but generally
points to the rather discouraging conclusion 
that the TAT was most diagnostic of homo- 
sexual tendencies by means of relatively mani- 
fest variables and in relation to an overt 
homosexual criterion. That is, most of the in- 
dices of homosexuality that functioned success- 
fully tended to be rather directly related to 
homosexuality and thus might be expected to 
be readily subject to censoring or inhibition. 
Further, the signs showed a much more con- 
vincing association with the criterion provided 
by the homosexual and normal groups than 
by the diagnostic council ratings of homosex- 
uality. This advantage of the overt homosexual 
criterion could, of course, reflect the relative
purity or sensitivity of the two criterion 
measures rather than the difference between 
them along the covert-manifest continuum. 
Perhaps the strongest congruence provided 
by our two studies bears upon the comparison 
of general or clinical judgments and the objec- 
tive TAT indices. In both investigations it was 
evident that the uninstructed and unhampered 
clinician could do a much better job of pre- 
dicting the criterion than could the diligent 
rater. When this consistent finding is added to a
similar observation made in a study of indices
of aggression (7), the implication seems un-
deniable that our difficult and time-consuming
ratings are not functioning at a high level of 
efficiency. It is not surprising that early 
efforts at objectivity and specification should 
lead to some diminution in clinical effective- 
ness, but this finding does make clear the 
limited utility of the TAT indices at present. 


SUMMARY 


We have conducted two studies designed to 
estimate the effectiveness of a variety of TAT 
indices that have been proposed as measures 
of homosexual tendency. In the first study, 
nine TAT indices were scored for 20 subjects 
and related to independent ratings derived 
from a diagnostic council, an objective observer 
and the subject himself. Results indicated 
that the indices appeared more closely related 
to the council ratings than to observer ratings, 
but there was no evidence of any superiority 
on the part of the observer ratings over the 
self ratings. A general clinical rating based on 
the TAT proved to be more closely related to
the diagnostic council rating than any of the
objective indices. 

In the second study, 20 TAT indices of 
homosexuality were examined for their effec- 
tiveness in differentiating between 20 overt 
homosexual subjects and 20 normals. The 
findings revealed 16 of the signs differentiating 
between the groups in the predicted direction, 
while three reversed the prediction and one 
showed no difference. Nine of the 16 predicted 
differences were significant at the .05 level, 
whereas none of the reversals attained conven- 
tional significance. A judge unacquainted with 
the identity of any subject was able to sort 
the protocols into homosexual and normal
groups with 95 per cent success. This represents 
a much more efficient discrimination than that 
permitted by any of the objective indices. 

Although these indices of homosexuality 
have functioned more effectively than equiv- 
alent indices for other variables, there still 
seems ground for serious doubt concerning 
their utility. Not only does the general clinical 
rating appear to function more effectively, but 
also the nature of the indices implies that they 
could easily be subjected to voluntary distor-
tion or inhibition, thus minimizing their useful-
ness in many diagnostic settings.


THEMATIC APPERCEPTION TEST: SOME EVIDENCE BEARING ON
THE “HERO ASSUMPTION”¹

Syracuse University


From its very moment of origin, the
Thematic Apperception Test has been
intimately associated in the minds of
most users with the assumption that there are
certain characters in each story that clearly 
reflect attributes of the storyteller, while other 
figures are more revealing of the storyteller’s 
perceptions of the individuals who populate 
his personal world. In an empirical discipline, 
however, the length of time that a belief has 
been held entitles it to no special considera- 
tion, and it is therefore quite natural that this 
assumption should be challenged. 

Objections to the hero assumption were 
particularly prone to occur because of the 
intimate relation between an individual’s
own attributes and that which he perceives 
in the outer world. Thus, it is commonly as- 
sumed, with considerable supporting evidence, 
that an individual who is highly aggressive 
will perceive more hostility in the world around 
him than an individual who is less aggressive. 
This association between internal and external 
tends to disrupt and blur the easy distinction 
implied by the original assumption. It should 
be mentioned that Murray, who initially 
formulated this assumption, was fully aware 
of the complexity of the relation between the 
individual’s inner and outer worlds and never 
intended that the test interpreter should 
slavishly maintain a rigid distinction between 
“hero” and “other.” In fact, the manual 
originally published with the test (12) contains 
a lengthy discussion of various complications 
that under special circumstances make it 
necessary to modify or abandon the assumption 
of a single hero or identification figure. 

Even accepting the fact that ultimate preci-
sion will surely depend upon some more com-
plicated set of assumptions than those we are
here concerned with, it is still an interesting
question whether the interpreter of the TAT 
can go further with the assumption of a distinc- 
tion between hero and non-hero than he can 
with the assumption that all figures in the story 
are equally revealing of the “own char- 
acteristics” of the storyteller. Although endless 
rational arguments can be introduced bearing 
on the choice between these assumptions, what 
we need most is not logical inference nor emo- 
tional polemic but controlled empirical evi- 
dence. In this spirit, the present paper is in- 
tended to outline the results of two small 
investigations designed to provide some very 
tentative findings that bear upon the utility of 
employing the distinction between hero figures 
and non-hero figures in interpreting Thematic 
Apperception Test protocols. 

In the first study, we began with the very 
simple notion that if “heroes” were more 
indicative of characteristics of the subject (S)
than were “other figures,” the S should per- 
ceive them or react to them differentially. 
Given this reasoning, we proceeded to admin- 
ister the TAT to a number of Ss, conducted 
an individual inquiry in which we tried to as- 
sess the reaction of the storyteller to each of 
the figures in his stories, and then looked for 
differences in reaction to those figures inde- 
pendently rated as identification figures as 
opposed to those that were not rated as identi- 
fication figures. 

In the second study, we were able to ex- 
amine certain quantitative shifts that occurred 
in TAT stories following a frustration experi-
ence within the framework of the hero assump-
tion and within the framework of the assump-
tion that all figures are equally indicative of
characteristics of the storyteller. 

It should be clearly understood that in these 
studies we are not asking whether the simple 
assumption of a single hero in each story is the 
most fruitful assumption that can be made. 
What we are asking is whether this is a more 
fruitful assumption than the opposite extreme, 
the assumption that all figures in the story are 
equally revealing of the storyteller’s characteristics. 
There are many fine gradations between 
these extremes, and a shrewd observer 
with a facile pen can complicate the assump- 
tions endlessly. Some of the many alternative 
assumptions have been considered in an earlier 
paper by one of the present authors (6), and his 
presentation has been examined critically by 
Piotrowski in several subsequent papers (13, 
14). What is needed now, however, is not fur- 
ther complexity or rational elaboration but 
rather a statement of assumptions with suffi- 
cient explicitness so that they lead to clear 
empirical consequences, followed by a careful 
testing of these consequences. 

The findings presented here are, at most, a 
beginning on the road to clarifying the kinds of 
underlying processes that operate in the 
construction of imaginative stories. Their sole 
virtue lies in the fact that they make clear under 
controlled circumstances something about the 
relative merit of two widely divergent and yet 
defensible assumptions concerning the process 
of interpreting imaginative protocols. 


SUBJECT REACTIONS TO HERO AND NON-HERO 
FIGURES 


If the assumption of a hero in each story 
whose attributes are especially revealing of the 
S’s psychological makeup is to prove defen- 
sible, it seemed to us that we should be able to 
show that the S reacted differently to hero 
figures than he did to those figures judged not 
to be hero figures. In particular, we reasoned 
that if heroes were carriers of storyteller 
attributes, the S should see these figures as 
more similar to himself than the non-hero 
figures, or else he should react with a violent 
denial of any similarity or resemblance between 
himself and the hero figures. This conclusion 
was derived from the assumption that the TAT 
revealed both conscious attributes and uncon- 
scious or unacceptable attributes of the 
S, coupled with the further reasoning that 
when a figure represented a conscious quality 
of the storyteller, he would accept and report 
the similarity, while under circumstances where 
there was an unacceptable impulse or quality 
involved, he would tend to deny strongly any 
similarity between himself and the figure. 

Consequently, the hypothesis to be tested 
in this study asserts that story-characters 
independently judged to be hero figures are 
seen by the S as similar to himself or as having 
no self-similarity whatsoever, while figures 
independently judged not to be hero figures 
are more often seen by the storyteller as 
resembling others or else as representing 
stereotyped or fictional characters. It seems 
clear that the assumption that all figures are 
equally revealing of the storyteller provides no 
basis for predicting any difference in S reaction 
to hero and non-hero figures. 


Procedure 


A shortened version of the TAT consisting of 
Cards 2, 5, 7GF, 9GF, 10, and 18GF was administered 
in a small group setting to 30 Syracuse University 
undergraduate females. Use of group administration 
seemed warranted in view of the results of an earlier 
investigation (7) comparing individual and group ad- 
ministration. The Ss had volunteered to participate in 
the study from an introductory course in psychology. 
After completion of the group test, individual appointments 
were made for each S within 48 hours from the 
time of the original test. During the individual inter- 
view, each S was asked to tell as much as she could 
concerning the factors that led to her creating each of 
the stories she had constructed. In addition, for each 
character in each of the stories, she was required to 
make a judgment of how similar the character was to 
herself and how similar it was to other people whom 
she had known. The responses of the Ss permitted us to 
categorize each character in terms of similarity to self 
(thought of as self, could be self, some similarity to self, 
denial of similarity to self) and similarity to other 
(thought of as other, similar to other). The characters 
were also classified in a general stereotype category 
when the S reported that the figure represented some 
general fictional character or when she reported no 
resemblance to anyone and could not say what had 
influenced her to create this character. 

The stories were independently analyzed in order to 
identify the hero and non-hero figures in each story. 
Two raters with no shared practice and with only a few 
general scoring principles were able to reach complete 
agreement on the identity of the hero figure (or the 
absence of a hero) in 90 per cent of the 180 stories 
rated. 

The analysis of these data posed certain thorny 
problems, as the most obvious and pertinent arrangements 
of the data led to more observations than subjects 
and thus made customary statistical analysis 
inappropriate. In view of this difficulty, we followed the 
convention of presenting the data descriptively in 
what seems to us the most revealing manner and then, 
in addition, performing certain further analyses that 
permit the application of the usual tests of significance. 

In order to reduce the number of observations to the 
number of Ss, we adopted a procedure for each of the 
three relevant areas of response (similarity to self 
rather than other, denial of similarity to self, stereo- 
type) that permitted us to assign to each S a score 
representing the extent to which his six stories fitted 
with or deviated from the predicted pattern. Thus, if 
the first story that the individual told found him 
identifying the judged hero of the story as resembling 
himself, while the non-hero figures were judged to be 
similar to others, a score of one would be assigned. If 
a non-hero figure was judged to be similar to the self, 
while the hero figure was not judged to be similar to the 
self, a score of minus one was assigned. If the hero and 
non-hero figures were treated in identical fashion, a 
score of zero was entered. The total score for each S 
consisted of the sum of the individual scores for his six 
stories and could theoretically vary from plus six 
(confirmation of the hypothesis in every story) to 
minus six (negation of the hypothesis in every story). 
This procedure provided three sets of scores represent- 
ing the extent to which the stories and inquiry re- 
sponses of the individual Ss conformed to our pre- 
dictions in regard to similarity to self versus other, 
denial of similarity to self, and stereotyped response. 
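
The scoring rule just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and the encoding of each story as a pair of booleans (whether the judged hero, and whether any non-hero figure, received the reaction being scored) are introduced here and do not come from the study.

```python
from typing import List, Tuple

# Each story is encoded as (hero_shows, nonhero_shows): whether the judged hero,
# and whether any non-hero figure, received the reaction being scored
# (e.g. "similar to self"). This encoding is an assumption for illustration.
Story = Tuple[bool, bool]


def story_score(hero_shows: bool, nonhero_shows: bool) -> int:
    """Return +1 if the story fits the prediction, -1 if it reverses it, else 0."""
    if hero_shows and not nonhero_shows:
        return 1
    if nonhero_shows and not hero_shows:
        return -1
    return 0  # hero and non-hero figures treated alike


def subject_score(stories: List[Story]) -> int:
    """Sum the per-story scores; with six stories the range is -6 to +6."""
    return sum(story_score(h, n) for h, n in stories)


# Hypothetical subject: four confirming stories, one reversal, one tie.
example = [(True, False)] * 4 + [(False, True), (True, True)]
print(subject_score(example))  # 3
```

For a reaction predicted to be more common among non-hero figures (the stereotype category), the same functions could be used with the two arguments swapped.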


Results 


The general results of the study are sum- 
marized in Table 1, where we find the distribu- 
tion of hero and non-hero figures in each of the 
categories having to do with similarity to self 
and to other as well as the stereotype category. 
A comparison of the frequencies actually ob- 
tained with those that would be expected by 
chance alone makes it evident that the trend 
of these data is in support of the predictions 
made in advance of the study. Figures in- 
dependently rated as heroes tend to be per- 
ceived as more similar to the storyteller than 
non-hero figures, or else are denied any sim- 
ilarity to the storyteller. On the other hand, 
non-hero figures, when compared to hero figures, 
tend more often to be identified as similar to 
some person other than the storyteller or else 
are classified as stereotyped. If the categories 
implying various degrees of similarity to self 
are combined into a single category and the 
same operation is performed for categories 
representing similarity to others, it is then 


TABLE 1 
SUBJECT REACTIONS TO HERO AND NON-HERO FIGURES 

                                   Frequencies 
                         Hero                   Non-Hero 
Reactions          Observed   Chance       Observed   Chance 
                              expectancy              expectancy 

Self                 ...        2.46         ...         3.54 
Could be self         14        7.38         ...        10.62 
Similar to self       21       11.07         ...        15.93 
Denial of self        29       24.19         ...        34.81 
Other                  9       11.07         ...        15.93 
Could be other        33       36.08         ...        51.92 
Stereotype            78       96.76         ...       139.24 
Total                ...        ...          ...         ... 


TABLE 2 
SIMILARITY TO SELF AND OTHER OF HERO AND NON-HERO FIGURES 

                      Hero    Non-Hero 

Similar to self        38        13 
Similar to other       42        73 

possible to examine the association between 
the hero and non-hero distinction and similar- 
ity to self and other in a single 2 X 2 table. 
In Table 2, the outcome of such a procedure 
is represented, and there is clear evidence for 
association between the hero designation and 
perceived similarity to the self. It is, of course, 
not legitimate to perform the usual statistical 
analyses because of the lack of independence 
of observations. However, any such test ap- 
plied to this array of findings would indicate 
the predicted association at a highly significant 
level. In general, these findings hold true not 
only for the over-all distribution reported in 
Table 1, but they are also sustained when the 
same distribution for each of the six cards is 
examined separately. 
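
For readers who wish to see what "any such test" would yield, the following Python sketch computes an uncorrected chi-square for the four frequencies of Table 2. It is offered purely as an illustration: as noted above, the observations are not independent, so the statistic cannot be taken at face value. The function name is an assumption introduced here, and the calculation is the ordinary textbook formula rather than anything reported in the study.

```python
from typing import List, Tuple


def chi_square_2x2(a: int, b: int, c: int, d: int) -> Tuple[float, List[List[float]]]:
    """Uncorrected chi-square for the 2 x 2 table [[a, b], [c, d]].

    Returns the statistic and the table of chance-expected frequencies.
    """
    n = a + b + c + d
    rows = (a + b, c + d)
    cols = (a + c, b + d)
    observed = [[a, b], [c, d]]
    # Expected frequency for a cell = (row total * column total) / grand total.
    expected = [[rows[i] * cols[j] / n for j in range(2)] for i in range(2)]
    chi2 = sum(
        (observed[i][j] - expected[i][j]) ** 2 / expected[i][j]
        for i in range(2)
        for j in range(2)
    )
    return chi2, expected


# Table 2: rows are similar to self / similar to other,
# columns are hero / non-hero.
chi2, expected = chi_square_2x2(38, 13, 42, 73)
print(round(chi2, 2))
print([[round(e, 2) for e in row] for row in expected])
```

With one degree of freedom, a value above 3.84 would conventionally be read as significant at the .05 level if the observations were independent.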

The results of our statistical comparison of 
hero and non-hero figures are presented in 
Table 3, where it is again made clear that the 
data tend to support our hypothesis. When 
heroes and non-heroes are compared in regard 
to perceived similarity to self and others, the scores 
present a significant deviation from chance in 
favor of the predicted greater similarity of 
self to hero figures and of others to figures 
judged to be non-heroes. The stories of 18 of 
our Ss revealed the predicted association 
between self-similarity and hero figures and 
other-similarity and non-hero figures; for four 
of the Ss, the hero and non-hero figures were 
treated in similar fashion, while the remaining 
eight Ss reversed the prediction. Examination 
of the tendency to deny similarity to self reveals 
confirmation of our prediction that this would 
occur more frequently with hero than with 
non-hero figures. A number of Ss provided no 
evidence of denial in any of their responses, so 
there was no possibility of differential percep- 
tion of the hero and non-hero figures; but in 
those cases where there was such evidence, 
12 of 17 Ss showed a tendency to react with 
denial to hero figures more often than to non- 
hero figures. We also found evidence that non- 
hero figures were more often categorized 








THE HERO ASSUMPTION IN TAT INTERPRETATION 


TABLE 3 
CONFIRMATION OF PREDICTIONS CONCERNING 
DIFFERENCES IN REACTION TO HERO 
AND NON-HERO FIGURES 

Reaction                               Mean     ...      t       p 

Hero similarity to self and non- 
  hero similarity to other             .966*    .343    2.08    <.05 
Denial of similarity to self           .343     .186    2.31    <.05 
Stereotype                            1.033     .261    3.95    <.01 

* A positive deviation from zero indicates confirmation of the 
predicted relation. 


as stereotyped or unrelated to either self or 
to other persons than were hero figures. There 
were nine Ss who revealed no difference in the 
incidence of hero and non-hero figures who were 
perceived as stereotyped; but of the re- 
mainder, 17 reported the non-hero figures as 
more stereotyped, and only four saw the hero 
figures as more stereotyped. 

The above findings provide clear confirma- 
tion of our predictions and thus support the 
value of the hero assumption. These same 
results, however, dramatically underline the 
shortcomings of this assumption in the face of 
certain TAT stories. Although the general 
trend of the data fits with the derivation from 
the hero assumption, there are individual Ss 
who consistently reverse the prediction; e.g., 
there are some Ss who characteristically view 
non-hero figures as like themselves and hero 
figures as like others. Thus, it seems clear that 
the actual situation is more complex than a 
literal application of the hero assumption 
would imply. Consequently, it becomes an 
important investigative task for the future to 
discover something about the types of Ss or 
stories or both where this assumption may be 
applied fruitfully, as well as those where some 
other assumption should be utilized. This same 
conclusion is supported by the general findings 
contained in Table 1. While these results 
support our prediction, it is a group trend that 
we observe with many individual exceptions. 

A very brief summary of our Ss’ reports 
concerning what had provided them with the 
idea for their stories is contained in Table 4. 
The results summarized here make clear that 
in a large number of cases (48 per cent) the Ss 
indicate only that the story came from their 
imagination. Next most frequently mentioned 
as a determinant is the picture or some specific 
element within it (37 per cent). In those cases 


TABLE 4 
REPORTED SOURCE OF TAT STORIES 

Source                       Frequency of occurrence 

Imagination                           ... 
Properties of card                    ... 
Autobiographical event                ... 
Experience of others                  ... 
Reading: fiction                      ... 
Movies, TV                            ... 
Reading: non-fiction                  ... 

* Total number of stories: 180. 





where the Ss are able to identify a specific 
experience as having suggested the story, this 
was less likely to have been a fictional en- 
counter (movies 13 per cent, novels 16 per 
cent) than an autobiographical event (21 per 
cent) or general experience and observation 
(17 per cent). Only a very small number (4 
per cent) of the stories were reported to have 
stemmed from nonfiction reading. 

In summary, the general findings of this 
study provide modest support for the hero 
assumption, although they also suggest that in 
individual cases the data do not mesh smoothly 
with this assumption. Let us turn now to the 
second study. 


CHANGES FOLLOWING FRUSTRATION VIEWED 
FROM THE VANTAGE OF THE 
HERO ASSUMPTION 


As we have already agreed, one of the 
major difficulties in evaluating the general 
utility of the hero assumption is posed by the 
close relation between internal states of the S 
and his perception of external reality. It is 
not easy to conceive of circumstances that on 
a priori grounds seem likely to produce changes 
in the motivational state of the S which will 
not be mirrored in changes in his perception 
of external reality. What we need is to con- 
struct a set of conditions under which the hero 
assumption predicts differential changes in 
hero and non-hero characteristics while the 
other assumption, of course, predicts consis- 
tent changes for both types of figures. 

Such a combination of conditions seemed to 
exist in connection with a set of data that was 
part of an earlier study (5). These data con- 
sisted of a set of TAT protocols collected before 
and after a frustration experience together with 
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an appropriate set of control protocols. Most 
important, a great deal was known about the 
details of the frustration experience and the 
Ss’ reactions to this situation. This information 
permitted us to make specific predictions con- 
cerning the changes that could be expected 
if one assumed all figures to be equally repre- 
sentative of the storyteller or if one assumed 
that there was only one figure, the hero, that 
reflected personal characteristics of the story- 
teller. 

In brief, the frustration situation was of 
such a nature that it seemed plausible to expect 
that the S’s hostility toward others would 
increase, that he would also perceive others as 
directing hostility or aggression toward him, 
and finally that he would feel guilty or direct 
aggression toward himself. All acts of aggres- 
sion within the TAT stories were analyzed in 
terms of whether the aggression was directed 
from (a) hero against other, (b) hero against 
self, (c) other against hero, or (d) other against 
other. 

The assumption that all figures were equally 
characteristic of the storyteller implied that 
there should be a significant increase in aggres- 
sive acts of all four types in view of our prior 
information concerning the increased aggres- 
sive tendencies on the part of the storyteller. 
This assumption clearly suggests that if the S 
was more aggressive, this tendency should be 
revealed evenly or equally in all figures within 
the story. On the other hand, the hero assump- 
tion implied that there should be an increase 
in only three of the four categories. Aggressive 
acts carried out by the hero against others 
should increase as a result of the storyteller’s 
increased extrapunitive tendencies. There 
should also be an increase of aggressive acts 
carried out by the hero against himself in view 
of the increased guilt or intrapunitiveness of 
the storyteller. Finally, there should be an 
increase in aggressive acts carried out by 
others against the hero as a result of his per- 
ception of the other members of the group as 
hostile toward him. There was nothing in our 
analysis of the frustration situation to suggest 
that the storyteller saw the persons around him 
as being hostile or aggressive toward each other 
so we would not predict any change in hostility 
between non-hero figures. Consequently, we 
find that the two assumptions agree in predicting 
significant changes in three of the four categories 
but are differentiated in their predictions 
concerning the fourth category. 

The reasoning above is readily defensible on 
rational grounds. The important feature of this 
derivation, however, is not its invulnerability 
to logical assault but rather the fact that it was 
executed prior to analysis of any data except 
that having to do with a category where no 
difference was predicted (aggressive acts car- 
ried out by heroes against others). In other 
words, before the fact, the two assumptions, 
coupled with our detailed knowledge of the 
frustration situation, seemed to lead to a 
differentiated prediction which we set out to 
test. After the fact, there is little doubt that 
with reasonable motivation and ingenuity 


either assumption could be rationalized by a 
sophisticated observer with these or almost 
any other set of empirical findings. 


Procedure 


The Ss in this investigation were 40 male under- 
graduate students of Harvard University who had 
been selected so that on the Allport-Kramer Prejudice 
Scale (1) they fell at the extremes of a group of 575 
students enrolled in an undergraduate psychology 
course. This division of the Ss into high and low 
prejudice groups is of no interest in the present inquiry, 
and our current analysis overlooks the dimension of 
prejudice except for the fact that experimental and 
control Ss were individually matched in terms of 
prejudice score. The 20 control Ss were also matched 
with the experimental Ss in age. The Ss were told 
merely that they were participating in a study of per- 
sonality structure and development. At the very outset 
of the study, each S was administered individually a 
shortened version of the Thematic Apperception Test 
consisting of Cards 3BM, 8BM, 16, and 20. At the end 
of approximately two months, the 20 experimental Ss 
were exposed to an experimentally contrived frustra- 
tion situation, and immediately following this, they 
were again given the TAT. The instructions were to 
make no effort to recall their original story, but if they 
thought of it first, to put it aside and tell the next 
story that came to mind. The control Ss were given the 
TAT under the same conditions except that there was 
no intervening frustration experience. 

The frustration experience has been fully described 
elsewhere (9), and it is necessary here only to point out 
that the experience was carefully divorced from the 
administration of the TAT and that the S was exposed 
to multiple frustration involving both psychological 
and complex social motives. The frustration of the 
latter motives was effected in a group experiment con- 
ducted with four confederates, two male and two 
female, who were ostensibly fellow participants in the 
study. By disguised manipulation, the experimental S 
was made to fail repeatedly on the group task, which 
was presumably related to intelligence; he thereby 
failed to achieve a sizeable financial reward offered for 
successful performance, and he also kept the other 
members of the group from winning financial rewards. 
At various times beginning immediately after the 
frustration situation, detailed subjective reports were 
collected, and these, in addition to objective observa- 
tions made during the conduct of the experiment, en- 
abled us to describe quite accurately just how the Ss 
experienced or reacted to this situation. 

The TAT protocols were scored simply by counting 
aggressive acts and then coding them in terms of 
whether or not they were carried out by the hero of the 
story and whether or not they were directed toward the 
hero. The effects of the frustration situation were 
measured by means of subtracting for each exper- 
imental S the number of aggressive acts within each 
category before frustration from the number of such 
acts after frustration. From this difference score, we 
then subtracted the difference between the first and 
second administration of the test for the matched 
control S. In other words, the scores that we are con- 
cerned with represent the difference between the first 
and second test administrations for the experimental 
Ss, corrected by the equivalent shifts shown by the 
control Ss. 
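
The control-corrected change score described above can be sketched as follows. This is an illustrative Python fragment under stated assumptions: the category labels, the function name, and the counts for the matched pair are hypothetical and are not data from the study.

```python
from typing import Dict

# The four directions of aggression coded in the stories (labels are illustrative).
CATEGORIES = ("hero_vs_other", "hero_vs_self", "other_vs_hero", "other_vs_other")

Counts = Dict[str, int]


def corrected_change(exp_before: Counts, exp_after: Counts,
                     ctl_before: Counts, ctl_after: Counts) -> Counts:
    """(after - before) for the experimental S, minus the same shift for the
    matched control S, computed separately for each category."""
    return {
        c: (exp_after[c] - exp_before[c]) - (ctl_after[c] - ctl_before[c])
        for c in CATEGORIES
    }


# Hypothetical counts of aggressive acts for one matched pair.
exp_before = {"hero_vs_other": 1, "hero_vs_self": 0, "other_vs_hero": 1, "other_vs_other": 2}
exp_after = {"hero_vs_other": 3, "hero_vs_self": 1, "other_vs_hero": 2, "other_vs_other": 2}
ctl_before = {"hero_vs_other": 1, "hero_vs_self": 1, "other_vs_hero": 0, "other_vs_other": 1}
ctl_after = {"hero_vs_other": 1, "hero_vs_self": 1, "other_vs_hero": 1, "other_vs_other": 1}

scores = corrected_change(exp_before, exp_after, ctl_before, ctl_after)
print(scores)
```

A positive score in a category means that the experimental S increased more than his matched control between the two administrations.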


Results 


The results of our analysis are summarized 
in Table 5. Utilizing either assumption, we 
predicted that aggressive acts by the hero 
against others, by hero against self, and by 
others against hero, would increase following 
frustration. There is confirmatory evidence 
for all of these predictions. The shift is in the 
predicted direction in all three cases and is 
statistically significant at the conventional 
.05 level for acts involving the hero against 
others, and others against hero. It is just short 
of this significance level for acts involving 
the hero against self. 


TABLE 5 
CHANGES FOLLOWING FRUSTRATION IN INCIDENCE OF 
VARIOUS TYPES OF AGGRESSIVE ACTS 


When we turn to the fourth column of Table 
5, we find the data bearing upon the differential 
predictions derived from the two assumptions. 
The hero assumption predicted no change, 
whereas the other assumption predicted a 
shift similar to that observed for the categories 
just discussed. The results are surprisingly 
definitive as the distribution of change scores 
has a mean of exactly zero; there is no evidence 
whatsoever of any shift in this category. In 
other words, the data we have reported provide 
strong evidence for the superior predictive 
efficiency of the hero assumption under the 
single circumstance where the two assumptions 
differ in their consequences. 


DISCUSSION 


It is clear that the results of these two studies 
provide some warrant for the continued use of 
the hero assumption. Our findings suggest that 
under two circumstances the derivations from 
this crude assumption fit the observed data 
better than the derivations that can be made 
from the easy assumption that all figures in 
the story are equally revealing of storyteller 
characteristics. Having agreed to this, however, 
we must hasten to emphasize the importance of 
research and formulation that will lead us to a 
more elaborate statement of the hero assump- 
tion so that under known conditions we can 
apply the kind of complexity in analysis that 
our findings, as well as the convictions of most 
clinicians, imply is necessary in order to derive 
consistently sensitive and accurate inferences 
from the instrument. 

There is no discussion of the alternatives to 
the hero assumption nearly so illuminating as 
Murray’s original analysis of the hero distinc- 
tion where we find the following special cases 
proposed: 


(1) The identification of subject with character some- 
times shifts during the course of the story; there is a 
sequence of heroes (first, second, third, etc.). (2) Two 
forces of the subject’s personality may be represented 
by two different characters, for example, an antisocial 
drive by a criminal and conscience by a law-enforcing 
agent. Here we would speak of an endopsychic thema 
(internal dramatic situation) with two component 
heroes. (3) The subject may tell a story that contains a 
story, such as one in which the hero observes or hears 
about events in which another character (for whom he 
feels some sympathy) is leadingly involved. Here we 
would speak of a primary and a secondary hero. Then 
(4), the subject may identify with a character of the 
opposite sex and express a part of his personality just as 
well in this fashion. (In a man this is commonly a sign 
of a high feminine component and in a woman of a high 
masculine component.) Finally, there may be no dis- 
cernible single hero; either (5) heroship is divided among 
a number of equally significant, equally differentiated 
partial heroes (e.g., a group of people); or (6) the chief 
character (hero in the literary sense) obviously belongs 
to the subject-object situation; he is not a component 
of the storyteller’s personality but an element of his 
environment. The subject, in other words, has not 
identified with the principal character to the slightest 
extent but has observed him as he would a stranger or 
disliked person with whom he had to deal. The subject 
himself is not represented, or is represented by a minor 
character (hero in our sense) (12, p. 7). 


Having tentatively identified such special cases 
is an important contribution, but it is equally 
essential to provide a careful specification of 
how one goes about identifying actual stories 
and figures that should be interpreted in the 
light of each of these cases. This, it seems to us, 
is a promising and largely untouched area of 
research. 

There are several tacks which such investiga- 
tion might follow. First, it is possible that one 
might be able to discover within-story cues 
that would be helpful in deciding whether to 
apply the simple hero assumption or some more 
complex version. Second, there is the ever pres- 
ent likelihood that knowledge of which assump- 
tion would be most fruitful will depend upon 
further information concerning the S himself. 
Thus, the appropriate assumption might vary 
with the cognitive style, character type, 
cultural background, or intellectual level of the 
S. Third, depending upon the situational con- 
text in which the test is administered, the 
process of story creation may vary so that 
different assumptions are warranted. When 
the test is given in a threatening assessment 
situation, we might find a different interpretive 
assumption warranted than when the test is 
given in a permissive clinical setting, where the 
S is voluntarily seeking assistance. There are, 
of course, many other types of questions that 
might be asked concerning this aspect of the 
interpretive process. However, if we knew 
something about the relation between the 
variants of the hero assumption and variation 
in the nature of the story, in the characteristics 
of the S, and in the situational context, we 
would be tremendously advanced over our 
present position. 

Granted that further research is indicated, 
do our findings in their present state provide 


us with useful information? They do! First, as 
we have indicated, they give modest support 
for those clinicians and investigators who have 
habitually employed the hero assumption in 
their use of the TAT. Second, they provide 
negative evidence for the various persons 
working with the instrument who have at- 
tempted to eliminate completely the hero 
assumption in favor of the other alternative 
we have considered. One may object to these 
inferences on the grounds that our findings are 
by no means definitive. Indeed they are not! 
But, in the absence of definitive findings, one 
uses the best evidence one can find, and it seems 
to us that the present studies fit this specifica- 
tion even if only by default. 

There remains the interesting question 
whether our findings have any implications for 
the research carried out by the many investi- 
gators interested in the quantitative study of 
the TAT who have not employed the hero 
assumption. What of investigations such as 
those by Eron (2), Hartman (3), Henry (4), 
and McClelland et al. (11), where there is 
typically no distinction between hero and 
other? In spite of the highly tentative nature 
of our findings, they do seem to imply that 
these investigators might have demonstrated 
somewhat greater test sensitivity if they had 
recognized the difference between character- 
istics displayed by hero and by other figures in 
their efforts to relate test attributes to in- 
dependent measures. This same possibility 
exists in connection with some of our own 
research (8, 10) and poses an intriguing prob- 
lem for further investigation. 


SUMMARY 


We have conducted two studies designed to 
test the comparative effectiveness of the 
assumption that there is customarily a single 
figure in TAT stories which is particularly 
revealing of the storyteller’s own attributes 
as opposed to the assumption that all figures 
in the stories are equally revealing of the sub- 
ject’s characteristics. In the first study, 30 
female undergraduate subjects were asked to 
judge the similarity between themselves and 
the figures in TAT stories they had created. 
They also reported on the similarity between 
these figures and other persons they had known, 
and described what had led them to tell each 
story. The results of the study confirmed our 
prediction derived from the hero assumption 
that hero figures would more often be identi- 
fied as similar to self or else denied any similar- 
ity to self, while non-hero figures would more 
often be identified as similar to other persons 
or else would represent general stereotypes. 

The second study examined changes in TAT 
protocols following a frustration experience. 
Aggressive acts carried out by heroes against 
others and against the self, and also aggressive 
acts carried out by others against the hero, 
all increased following frustration. There was 
no change in the incidence of aggressive acts 
carried out by others against others. The hero 
assumption had predicted just this pattern of 
results, while the alternative assumption in- 
correctly predicted consistent increases in all 
four types of aggressive acts. 

Thus, the results of both studies provide 
evidence supporting the utility of the conven- 
tional assumption of a hero in each TAT story. 
The findings also suggest, however, that under 
certain conditions a more complex set of assump- 
tions may be desirable or necessary. 


REINFORCEMENT OF AFFECT RESPONSES OF SCHIZOPHRENICS 
DURING THE CLINICAL INTERVIEW 


KURT SALZINGER AND STEPHANIE PISONI 
Biometrics Research, New York State Department of Mental Hygiene 


BEHAVIOR theory has recently expanded 
its scope to deal with verbal behavior (6). 
Greenspoon (1) demonstrated 
the effectiveness of verbal reinforcers upon a 
subject’s rate of utterance of plural vs. non- 
plural words. Hildum and Brown (2) showed 
the effect of verbal reinforcement upon atti- 
tude statements. Verplanck (7) used verbal 
reinforcers during conversations to condition 
opinion statements. Finally, Salzinger (5) in- 
vestigated the conditioning process in clinical 
interviews with schizophrenics. 

While these studies have supplied evidence 
for the validity of the application of reinforce- 
ment theory to verbal behavior, a good deal 
of research is still necessary. The present ex- 
periment is designed to study (a) reliability of 
response unit isolation, i.e., to what extent the 
interviewer can respond reliably with rein- 
forcement to the patient’s verbal behavior, 
(b) the effect of different sources of reinforcements
(different interviewers) upon the verbal
behavior of the interviewee, and (c) the rela- 
tionship between the number of reinforcements 
and the number of responses in extinction. 

Since a patient’s ability to express affect is 
usually evaluated through the interview and 
is considered an important criterion both for 
diagnosis and prognosis of schizophrenia, the 
conditions under which affect is evoked by the 
interviewer might have theoretical importance 
for arriving at laws describing interview be- 
havior and practical importance in furnishing 
an objective method for the evaluation of 
“flatness” of affect. An attempt was made, 
therefore, to examine the effect of reinforcement
upon schizophrenics’ output of affect
responses in an interview. 


METHOD 
Subjects 


Twenty-four female and twelve male hospitalized 
schizophrenics from the age of 18 to 50, with a median 
of 34.3 years, were selected from the admissions to 
Brooklyn State Hospital. Patients were classified as 
schizophrenic upon their current admission to the 
hospital distribution center. One was later rediagnosed 
as manic-depressive. Nineteen had been previously 
hospitalized, and 17 had no history of previous hospital- 
ization. 

None of the patients received any somatotherapy 
such as insulin, electric shock, or tranquillizing drugs 
for at least one week before the first interview or during 
their participation in the study. 

The first 20 patients interviewed were placed in the 
experimental group. Fourteen were females and six 
males, with a median age of 32.0 years and a median 
number of years of education of 10.3. The other 16 
patients were placed in the control group. Twelve were 
females and 4 males with a median age of 34.5 years 
and a median number of years of education of 11.5. 


Experimental Procedure 


All patients were interviewed one week after their 
arrival in the hospital. The Ss in the experimental 
group were interviewed once by a female E and once 
by a male E on two consecutive days for a period of 30 
minutes each. Eleven of the patients were first inter- 
viewed by the male; nine were first interviewed by the 
female E. The Ss in the control group were interviewed 
once only, nine by the female and seven by the male E.
All interviews were recorded with the apparatus in full 
sight of both patient and interviewer. 

The interview was presented to the patients as a 
routine mental hospital procedure. For the first inter- 
view, E brought the patient into the experimental room 
and explained that the interview was being conducted 
to help him. The second interview was introduced by 
telling the patient that it is helpful to patients to be 
interviewed more than once despite the fact that this 
might mean a repetition of their story. All other inter- 
view procedures were the same for the second as for the 
first interview. 

The E questioned the patient about the following
items: name, age, marital status, children, and siblings. 
The patients answered these questions with little hesita- 
tion, thus making it possible for E to begin the interview 
by repeating the answer given, by writing it down, and 
by saying such words as “mmm-humm,” “uhhuh,” “I
see,” etc. This procedure was adopted in an effort to
obtain factual information upon which subsequent
interview questions could be based and to establish E
as a source of reinforcement, in this way encouraging
the patient to speak in the presence of E. The main
part of the interview was then initiated with the 
question, “Would you tell me why you are here in this 
hospital?” 

Interviews with the experimental group were con- 
ducted in the following manner: During the first 10 
minutes (operant level), the base rate of spontaneous 
affect responses (see definition below) was determined. 
The E asked questions but did not reinforce any statement
made by the patient. Reinforcement was defined
as E’s verbal agreement through the use of such words 
as “mmm-hum,” “I see,” “yeah,” etc., with statements 
made by the patient. 

During the second 10 minutes (conditioning), E
continued to question the patient and reinforced each
affect response by immediately following each expression
of affect with verbal agreement.

During the third 10-minute period (extinction), E
withheld all reinforcement but continued asking 
questions. 

Interviews with the control group also lasted 30 
minutes, during which time E asked questions but did 
not reinforce any of the patient’s responses. This pro- 
cedure was identical with the operant level phase of the 
experimental group procedure. 


Definition of Response 


The response class of affect for this experiment was 
defined as any statement describing or evaluating the 
state (other than intellectual or physiological) of the 
patient by himself. The response class therefore included
all statements beginning with the pronouns “I”
or “we” and followed by an expression of affect. Examples
include such expressions as: I am satisfied, I’m
happy, We enjoyed it, I like him, I’m very close to him,
I was mad at him, We hated her, I’ll always be jealous
of him, I am upset, I am a lonely person, I was so 
ashamed, I’m sorry for him, I feel... (followed by 
any words), I was frightened, We couldn’t take it, I 
always suffer, I had a fright, etc. 

Quotations in which affect is referred to the speaker, 
although fitting all other criteria, were excluded on the 
basis of not being direct expressions of the patient’s 
affect. An example of this was, “My husband said I 
didn’t feel good.” Statements like “I am happy and 
excited” were considered as one affect statement only, 
since the pronouns “I” or “we” did not precede the 
second affective word. On the other hand, incomplete 
(in the sense that the object of the affect is not men- 
tioned) statements like “I love. . .” or “We feared. . .” 
were viewed as separate responses. 

Certain types of private events or internal states 
were excluded from the response class of affect because 
they referred primarily to intellectual processes, to 
actions which are sometimes but not always associated 
with affect, or to desires which appear to constitute a 
class of responses different from the affect class as 
defined here. I am confused, I am confident, I would 
like to..., I want, I was surprised, I am not well, We
forgave him, I threaten her constantly, I didn’t trust 
them, etc., are examples. 

A count per minute was taken of statements belonging
to the general class of self-references (statements
beginning with “I” or “We”) in order to compare
changes in the occurrence of this class with those of the 
class of self-referred affect statements. In other words, 
self-referred statements included both self-referred 
affect statements as well as self-referred nonaffect 
statements. 


Interviewer Questions 

After the initial question, “Why are you here?” E 
asked additional questions only when the patient 
ceased talking for at least two seconds. Some or all of 
the following topics were discussed during each inter- 
view: reasons for being in the hospital and causes for 
illness; patient’s relationships to his parents, siblings,
fellow employees, employers, fellow students and 
teachers, wife or husband, children, friends; patient’s 
activities during free time, and plans for the future. 
The E made an attempt to balance these topics over 
the different conditions. For instance, if the patient 
discussed the symptoms of his illness in the operant 
level condition, E asked questions regarding the pos- 
sible causes of the illness during conditioning and 
brought the patient back to these topics during ex- 
tinction. As long as the topics were approximately 
balanced over the three conditions, however, E took 
his cues as to topic from the content of the patient’s 
statements. Both the number of topics and the order in 
which they were discussed varied from interview to 
interview. 

Questions* asking directly for affect such as, “How 
did you feel about that?” or “Were you happy?” were 
not used. 


RESULTS 
Reliability of Response Unit Isolation 


A sample of 15 recorded interviews was 
coded independently for self-referred affect 
by the two interviewers. Proportions of agree- 
ment based on the number of affect statements 
counted were computed separately for each 
condition of each interview and ranged from 
.79 to 1.00. Examination of the disagreements 
revealed that they were primarily due to poor 
recording. It was therefore concluded that the 
affect responses as defined in this experiment 
can be objectively isolated and counted. 
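
The text does not spell out how the proportion of agreement was computed. One count-based reading, assumed here purely for illustration, divides the smaller of the two coders' counts by the larger for each condition of an interview. The function name and the counts below are hypothetical.

```python
def agreement_proportion(count_a, count_b):
    """Ratio of the smaller to the larger of two coders' counts (assumed definition)."""
    if count_a < 0 or count_b < 0:
        raise ValueError("counts must be non-negative")
    larger = max(count_a, count_b)
    if larger == 0:
        # Neither coder counted any affect statements: treat as full agreement.
        return 1.0
    return min(count_a, count_b) / larger


# Hypothetical counts for one interview: (coder A, coder B) per condition.
counts = {"operant": (11, 14), "conditioning": (19, 19), "extinction": (8, 9)}
for condition, (a, b) in counts.items():
    print(condition, round(agreement_proportion(a, b), 2))
```

A count-based ratio of this kind cannot tell whether the two coders marked the same statements, only whether they marked the same number of them.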


* A pilot study of the effect of interviewer questions
in this type of interview was undertaken by the following
four graduate students: Ruth Beach, R. S. Feldman,
P. Goldberg, and Marilyn G. Hamlin. They
found no significant relationship between the condi- 
tioning effect and the following: specific vs. nonspecific 
questions, the introduction of new topics vs. the 
continuation of old topics, the total number of ques- 
tions asked, and the number of questions indirectly 
leading to affect (e.g., How did you react to that?). A 
larger more definitive study of question effects is 
presently being undertaken by the authors. 
Interviewer Differences

The adequacy of the definition of the re- 
sponse was examined by having the two Es who 
served as interviewers in the experiment inde- 
pendently code the affect responses of the same 
15 recorded interviews. This procedure made it 
possible to test whether both Es would have 
reinforced the same responses in identical inter- 
views. 

In order to determine whether the two inter- 
viewers evoked a different number of affect 
statements, comparisons were made between 
the two interviewers on each condition of the 
initial interviews of the experimental group and 
the control group interviews. The Mann- 
Whitney test yielded no statistically signifi- 
cant differences (p > .05), suggesting that the 
two interviewers evoked approximately the 
same number of affect statements in their 
respective interviews. The exact p levels for 
the experimental group interviews were .79 for 
the operant level, .54 for the conditioning 
period, and .52 for the extinction period. For 
the control interviews, the levels were consid- 
erably lower, although still not significant. The 
p level for the first 10-minute period was .07, 
for the second .06, and for the third .22. 


Base Level of Affect


In order to determine whether the experi- 
mental and control groups differed initially in 
the amount of affect spontaneously emitted, 
a statistical comparison of the number of affect 
statements given in the first 10 minutes of the 
control interviews with the number of such 
statements given during the operant level of 
the experimental interviews was made. The 
difference was not statistically significant (p = 
.37) by the Mann-Whitney test (4). 


Conditioning Effect 

The difference between operant level, con- 
ditioning, and extinction for the initial inter- 
views of the experimental group was tested by 
Wilcoxon’s nonparametric analysis of variance 
(8) and found to be statistically significant 
(p < .01). The greatest number of affect statements
was emitted during conditioning (sum of
ranks = 51.0), the next greatest during the 
operant level (sum of ranks = 39.5), and the 
least during extinction (sum of ranks = 29.5). 
The difference between the three conditions of 
the second interviews of the experimental
group was not statistically significant (.2 > p
> .1). Inspection of the sums of ranks for the
three conditions, however, still revealed the 
greatest number of affect statements in the 
conditioning period (sum of ranks = 45.0), 
the next greatest number of affect statements 
in extinction (sum of ranks = 41.5), and the 
smallest number during operant level (sum of 
ranks = 33.5). The fact that the second inter- 
view did not yield a statistically significant 
difference between the conditions appears to 
be due largely to the greater number of re- 
sponses emitted during extinction in the second 
interview in contrast to the first. This is also 
evident in the comparison of Figs. 1 and 2, 
where responses during extinction were plotted 
as a function of number of reinforcements, 
from which one can see the steeper slope in the 
second than in the first interview. In other 
words, this result does not indicate a lack of 
reliability over time but, rather, that apparently
fewer reinforcements were necessary for
the same number of extinction responses in 
the second than in the initial interview. This 
effect is generally reported for reconditioning. 
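
The reported sums of ranks allow a rough check of these significance levels. The sketch below assumes that Wilcoxon's procedure is equivalent to a Friedman-type rank test on 20 patients and three conditions, and it ignores any correction for ties; neither assumption is stated in the text. For 2 degrees of freedom the chi-square tail probability is exactly exp(-x/2), so only the standard library is needed. The function names are illustrative.

```python
import math


def friedman_statistic(rank_sums, n_subjects):
    """Friedman chi-square computed from per-condition rank sums (no tie correction)."""
    k = len(rank_sums)
    sum_sq = sum(r * r for r in rank_sums)
    return 12.0 / (n_subjects * k * (k + 1)) * sum_sq - 3.0 * n_subjects * (k + 1)


def chi2_tail_2df(x):
    """Upper-tail probability of a chi-square variable with 2 degrees of freedom."""
    return math.exp(-x / 2.0)


# Rank sums as reported: conditioning, operant level, extinction (first interview),
# then conditioning, extinction, operant level (second interview).
first = friedman_statistic([51.0, 39.5, 29.5], n_subjects=20)
second = friedman_statistic([45.0, 41.5, 33.5], n_subjects=20)

print(round(first, 3), round(chi2_tail_2df(first), 4))
print(round(second, 3), round(chi2_tail_2df(second), 4))
```

Under these assumptions the first interview gives a statistic near 11.6 (p below .01) and the second a statistic near 3.5 (p between .1 and .2), in line with the levels reported above.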

When the same test was used to compare the
three 10-minute periods of control group inter- 
views, no significant difference was found (p = 
.75). The greatest number of affect responses 
was emitted during the last 10 minutes of the 
interview (sum of ranks = 36.0), the next 
greatest during the second 10 minutes (sum of 
ranks = 34.0), and the smallest number during 
the first 10 minutes (sum of ranks = 32.0). 

The Mann-Whitney test was used to com- 
pare each 10-minute period of the experimental 
group interviews to its corresponding period in 
the control interviews. Comparison of the first 
10 minutes of the experimental with the con- 
trol interviews yielded no significant difference 
(p = .37). Comparison of the last 10 minutes 
also yielded no significant difference (p = .92). 
The difference between the second 10-minute 
period of the control-group interviews and the 
conditioning period of the experimental-group 
interviews was statistically significant (p = 
.03), using the one-tailed test hypothesis that 
the experimental group would emit more affect 
than the control group. 
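
For readers unfamiliar with the Mann-Whitney test, a minimal sketch follows. It counts, over all experimental-control pairs, how often the experimental value is the larger, and converts that count to a one-tailed probability by the normal approximation without tie or continuity correction. The affect counts are hypothetical, and the authors may well have used exact tables rather than this approximation.

```python
import math
from statistics import NormalDist


def mann_whitney_u(x, y):
    """U for sample x: pairs (xi, yj) with xi > yj count 1, ties count 1/2."""
    u = 0.0
    for xi in x:
        for yj in y:
            if xi > yj:
                u += 1.0
            elif xi == yj:
                u += 0.5
    return u


def one_tailed_p(u, n1, n2):
    """Normal approximation to P(U >= u) under the null hypothesis."""
    mean = n1 * n2 / 2.0
    sd = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    return 1.0 - NormalDist().cdf((u - mean) / sd)


# Hypothetical affect counts for the second 10-minute period.
experimental = [12, 15, 9, 20, 17, 11, 14]
control = [8, 10, 7, 13, 9, 6]

u = mann_whitney_u(experimental, control)
print(u, round(one_tailed_p(u, len(experimental), len(control)), 4))
```

The one-tailed form matches the directional hypothesis stated above, that the experimental group would emit more affect than the control group.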

Figs. 3 and 4 represent individual cumulative 
curves of affect responses over the three condi- 
tions of the experimental group interviews. Fig. 
3 shows the curves of three individuals whose 
rate of response was modified by reinforcement. 
These individuals were selected on the basis of
being representative of low, medium, and high
rates of response in operant level. Fig. 4 shows 
the curves of three individuals whose rates of 
response were not modified by reinforcement. 
These individuals were also representative of 
their group. 

In order to gauge the lawfulness of the pro- 
cess of conditioning in the experimental group, 
the number of responses of each patient during 
extinction was plotted against the number of 
reinforcements in conditioning. It was found 
that the relationship could be described by a 
linear equation, i.e., the greater the number of 
reinforcements administered, the greater the 
number of responses emitted during extinction. 
The goodness of fit can be seen by examining 
Figs. 1 and 2. Two patients appear to deviate 
markedly from the rest of the sample. The 
diagnosis of one of these patients was changed 
from schizo-affective to manic-depressive psychosis.
She received 13 reinforcements and
gave 20 extinction responses. The other deviate
from the group, who received 27 reinforce- 
ments and gave only 9 extinction responses, 
was later found to be hard of hearing. 
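
The text does not say how the straight line was fitted. As an illustration only, the sketch below fits an ordinary least-squares line to hypothetical data for typical patients and then computes residuals for the two deviating patients, whose values (13 reinforcements with 20 extinction responses, and 27 reinforcements with 9 extinction responses) are taken from the text.

```python
def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical (reinforcements, extinction responses) pairs for typical patients.
reinforcements = [4, 8, 12, 16, 20, 24]
extinction = [3, 5, 8, 10, 13, 15]
slope, intercept = fit_line(reinforcements, extinction)

# The two deviating patients described in the text.
for label, x, y in [("rediagnosed", 13, 20), ("hard of hearing", 27, 9)]:
    residual = y - (slope * x + intercept)
    print(label, round(residual, 1))
```

With a line fitted to data of this shape, the rediagnosed patient falls above the line and the hard-of-hearing patient below it, which is the direction of deviation described in the text.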

In order to investigate further the effects of 
reinforcement, rank-order correlation coeffi- 
cients were computed between every pair of 
conditions, separately for the first and second 
experimental interviews and the control inter- 
views. As expected for the experimental group, 
the highest correlations were found between 
the number of reinforcements and the number 
of extinction responses for both the first (.73) 
and second (.60) interviews. The correlations 
between all other pairs of conditions were much 
lower, varying within a restricted range from 
.41 to .50 (see Table 1). The correlation between
the two extinction periods of Interviews
1 and 2 was .41 (p < .05) and that between 
extinction of Interview 1 and conditioning of 
Interview 2 was .44 (p < .05). 








TABLE 1

RANK-ORDER CORRELATIONS BETWEEN THE
THREE TEN-MINUTE PERIODS OF THE
INTERVIEWS

                                       Experimental
Conditions                          First       Second      Control
                                    Interview   Interview

Conditioning vs. Extinction         .73**       .60**       .74**
  (2nd vs. 3rd 10 minutes)
Operant level vs. Conditioning      .46*        .47*        .85**
  (1st vs. 2nd 10 minutes)
Operant level vs. Extinction        .41*        .50*        .70**
  (1st vs. 3rd 10 minutes)

* p < .05.
** p < .01.


In direct contrast to the results of the experi- 
mental group, the correlations between the 
conditions in the control group were all evenly 
high, ranging from .70 to .85. 

Since every affect statement in the second 
period of the experimental group was rein- 
forced, the question arises whether the correla- 
tion between number of reinforcements and 
number of responses in extinction merely re- 
flects a correlation between the affect state- 
ments in different parts of the interview. 

Kendall’s tau (3) was computed in order to 
partial out the correlations between operant 
level and conditioning and operant level and 
extinction from the reinforcement-extinction 
correlation in the experimental group. The 
tau of .58 (p = .0002) between number of rein- 
forcements and extinction responses for the 
first interview became .53, and the tau of .45 
(p = .002) for the second interview became 
.37. When the first 10-minute period of the 
control interviews was partialed out of the 
correlation between the second and third 10-minute
periods, the tau of .59 (p = .002)
became .38, a drop of .21, whereas the corre- 
sponding drop in the experimental group was 
only .05. 

This indicates that only a small part of the 
correlation between number of reinforcements 
and number of responses in extinction can be 
accounted for by the correlation between op- 
erant level and number of reinforcements; the 
correlation between the second and third parts 
of the control group, on the other hand, can 
be accounted for in larger part by the correla- 
tion between the first and second part of the 
interview. 
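
Kendall's partial rank correlation removes the influence of a third variable from a tau coefficient. The sketch below shows the standard formula for a single controlled variable. The taus involving operant level are not reported in the text, so the two values supplied for them here are hypothetical placeholders, and the function name is illustrative.

```python
import math


def partial_tau(tau_xy, tau_xz, tau_yz):
    """Kendall's partial tau between x and y with z held constant."""
    denom = math.sqrt((1.0 - tau_xz ** 2) * (1.0 - tau_yz ** 2))
    if denom == 0.0:
        raise ValueError("partial tau is undefined when |tau_xz| or |tau_yz| equals 1")
    return (tau_xy - tau_xz * tau_yz) / denom


# x = reinforcements, y = extinction responses, z = operant level.
# tau_xy = .58 is reported for the first interview; the other two taus are hypothetical.
print(round(partial_tau(0.58, 0.30, 0.28), 2))
```

The smaller the correlations of operant level with the other two periods, the less the partial tau departs from the original tau, which is the pattern the text reports for the experimental group.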
The rate of making self-referred statements
was found to be invariant over the three con- 
ditions of the first and second interviews of the 
experimental group as well as the interviews 
of the control group. This was tested by 
Wilcoxon’s (8) nonparametric analysis of vari- 
ance (p > .05). Inspection of the sums of 
ranks for the first experimental interview 
(based on an N of 20) indicated a trend toward 
decreasing frequency of self-referred state- 
ments from operant level (sum of ranks = 48.0) 
to conditioning (sum of ranks = 37.0) to 
extinction (sum of ranks = 35.0), while in the 
second interview, the greatest number of self- 
referred statements appeared in conditioning 
(sum of ranks = 46.0), the next greatest in 
extinction (sum of ranks = 39.0), and the 
smallest in operant level (sum of ranks = 
35.0). In the control group (based on an N of 
16), the greatest number of self-referred state- 
ments was emitted during the first 10 minutes 
(sum of ranks = 38.5), the next greatest 
during the third 10 minutes (sum of ranks
= 33.5), and the smallest during the second 10
minutes (sum of ranks = 30.0). 


DISCUSSION 


While it is true that two different inter- 
viewers evoked the same number of affect 
statements within the margin of random error, 
the fact that two of the comparisons ap- 
proached the .05 level of significance would 
seem to indicate that there is still room for 
greater control of interviewer behavior or that 
such interviewer characteristics as sex, age, 
appearance, etc., play an important role in 
controlling the interviewee’s behavior. 

While Verplanck (7) was unable to test the 
constancy of the response used during any 
one conversation or among different conver- 
sations, the recording done in this experiment 
made it possible to demonstrate that a response 
class decided upon prior to the interview can 
be reliably reacted to by different interviewers 
and coders. 

Although this experiment gave definite 
evidence for conditioning, there was much 
variability among individuals. This was not 
surprising in view of the fact that the inter- 
viewing was carried out with schizophrenics 
and the response conditioned one of affect 
statements. This variability, both in operant 
level and in susceptibility to reinforcement,
might well provide an objective prognostic 
measure of degree of flatness of affect. Such 
a measure would be of value since there are 
strong indications (10) that marked flatness of 
affect augurs badly for the outcome of schiz- 
ophrenia. Follow-up of the patients inter- 
viewed in this sample may allow an exact test 
of the relationship between outcome of illness 
and flatness of affect. 

Perhaps one of the most interesting findings 
of this study was the linear relationship be- 
tween number of reinforcements and number of 
responses in extinction. In a similar study, 
Williams (9) found that the relationship be- 
tween number of reinforcements and number of 
responses during extinction for food-deprived 
rats was also linear up to about 30 reinforce- 
ments. Since fewer reinforcements were ad- 
ministered in the present study, it will cer- 
tainly be of interest to try to duplicate the rest 
of Williams’ curve for verbal behavior. The 
results become even more dramatic when 
account is taken of the fact that flatness of 
affect, as defined here by frequency of affect 
statements, appears to vary directly as a func- 
tion of the interviewer’s reinforcing behavior. 
The implications for the regular psychiatric 
interview are self evident. 

The two patients in the experimental group 
who showed atypical relationships between 
reinforcement and extinction are noteworthy
because both deviated in directions that seem 
sensible on a post hoc basis. The hard-of-hearing 
individual got more reinforcements for the 
number of responses she emitted in extinction 
than the Ss in the rest of the sample. This, of 
course, is exactly what one might expect if S
could not hear all the verbal reinforcements 
given her. The patient whose diagnosis was 
changed from schizophrenic to manic-de- 
pressive psychosis, manic stage, gave many 
more affect statements in extinction than 
might be expected from the number of rein- 
forcements she received. This observation 
suggests the possibility that if a sample of 
manic-depressives was administered the same 
number of reinforcements as the schizophrenic 
group, a linear relationship might similarly 
be found but with a steeper slope. Manic- 
depressives may require a smaller number of 
reinforcements than schizophrenics for the 
same number of responses in extinction. 

The correlation coefficients between all 
possible combinations of conditions were computed
in an attempt to see whether the relationship
between number of reinforcements
and number of responses during extinction 
could be explained simply as a correlation that 
might be obtained between any two 10-minute 
periods of the same interviews. Table 1 shows 
that while correlations occurred between all 
conditions, the highest were between the num- 
ber of reinforcements and number of responses 
during extinction in the experimental group. 
Furthermore, upon partialing out the operant 
level correlation to study the relationship to 
be expected between any two 10-minute peri- 
ods, no substantial change occurred in the 
correlation between number of reinforcements 
and frequency of response in extinction. The 
correlations between the conditions in the 
control group yield further evidence for the 
conditioning effect in the experimental group. 
The fact that they are all approximately 
equally high, whereas the reinforcement-extinction
correlations in the experimental
group are outstandingly high by comparison 
to the operant-extinction and operant-rein- 
forcement correlations, argues strongly for the 
effect of reinforcement. 

A final control on the effect of reinforcement 
in the interview, that of the rate of self-re- 
ferred statements, was used in order to check 
on the possibility that the reinforcement func- 
tioned merely to make the patient feel more 
at ease and therefore talk more about himself. 
This analysis indicated that the conditioning 
effect was specific in producing an increase 
in self-referred affect statements and not in 
increasing the general class of self-referred 
statements. 


SUMMARY 


Thirty-six hospitalized schizophrenics were 
included in this study. Twenty of them (the 
experimental group) were interviewed for a 
period of 30 minutes each on two consecutive 
days by two interviewers. The other 16 (the 
control group) were given one interview only, 
which lasted for 30 minutes. Each interview 
in the experimental group consisted of a 10- 
minute operant level, during which E only 
asked questions necessary to keep up the 
patient’s talk but did not respond to the 
patient’s speech; 10 minutes of conditioning, 
during which E reinforced by agreement all 
self-referred affect statements; and, finally, 
10 minutes of extinction, during which E withheld
all further reinforcement. Each interview
in the control group consisted of 30 minutes of 
operant level only. 

It was demonstrated that a difference in
interviewers or sources of reinforcement per 
se need not produce discrepant results during 
an interview when utilizing a standard pro- 
cedure for interviewing. It was further shown 
that a verbal response class can be reliably 
isolated and reacted to. Conditioning of the 
response class of self-referred affect state- 
ments was found to be possible with schiz- 
ophrenics during an otherwise usual clinical 
interview. The relationship between number of 
reinforcements and number of responses in 
extinction was described by means of a straight 
line, i.e., the greater the number of reinforce- 
ments, the greater the number of extinction 
responses. 

The lawfulness of these findings indicates 
that the clinical interview is subject to in- 
vestigation by experimental techniques. Fur- 
thermore, a controlled interview may prove 
useful as a research tool.
THE EFFECTS OF PUNISHMENT (ELECTRIC SHOCK)
ON PERCEPTUAL LEARNING¹


HAROLD J. McNAMARA, CHARLES M. SOLLEY, AND JOHN LONG
The Menninger Foundation 


THE purpose of these studies was to explore
the consequences of the association
of punishment with percepts and to 
trace the residual effects of such associations in 
several aspects of perceptual and memoric 
organization. Past experiments in this area 
(e.g., 1, 4, 7, 10, and 11) have not yielded consistent
results on the questions, “What are
the conditions of learning which lead to ‘perceptual
emphasis’ and what are the conditions
of learning which lead to ‘perceptual deemphasis’
of percepts associated with punishment?”
Although Tolman (12) speaks of punished 
events standing out as things to be avoided, 
and Postman and Brown (6) have shown that 
people can become sensitized to objects of 
failure, there are strong data indicating that, 
under certain conditions of learning, punished 
events are “perceptually repressed” or “per- 
ceptually deemphasized” (4, 7, 10). The use 
of radically different experimental procedures 
and tests, different stimulus materials, differ- 
ent intensities of punishments, etc., make 
interexperimental comparisons almost impos- 
sible. The results of such studies, however, 
suggest several important variables, and a 
series of experiments was designed in which 
these pertinent variables were manipulated 
within a single procedural framework. 

In previous research, the time interval be- 
tween the percept and the punishing event 
varied unsystematically from study to study. 
Smith and Hochberg (11) used contiguous 
pairing; Pustell (7) used .2 second; Dulany (4) 
used 2 seconds; and Ayllon and Sommer (1) 
used 3 seconds. Since research on classical 
conditioning indicates that the amount of 
associative strength is a function of the CS-US 
interval, it is necessary to vary this time inter- 
val systematically or else to hold it constant. 


1 This research was supported by a research grant 
(M-715C) from the National Institute of Mental 
Health of the National Institutes of Health, US Public 
Health Service. The authors are indebted to Gardner 
Murphy who gave invaluable guidance and suggestions 
and to Robert Sommer who provided much of the 
original stimulation for these studies. 
Still another factor is the intensity of the
punishing agent, which varies considerably 
from study to study and is reported in terms 
of subjective evaluation, e.g., “unbearable” 
(4, 7), “quite painful” (10), etc., as well as 
in terms of objective voltage levels (11). 
It was decided to use both subjective evalua- 
tions and objective criteria, and to determine 
the relationship between these two methods of 
ordering “intensity” of punishment. 

The third major variable that could play 
an important role in the association of punish- 
ment with percepts is the possibility of escape 
from the punishing agent. Ayllon and Sommer 
(1) had permitted subjects to escape from the 
punishment; Pustell (7) and Dulany (4) used 
a design that permitted neither escape nor 
avoidance. Reece’s study (10) involved both 
escape and no-escape conditions, and he found 
striking differences between these two condi- 
tions both in terms of threshold changes for 
the associated percept and in terms of subjec- 
tive interpretation of the “meaning” of the 
electric shock. 


EXPERIMENT I 

Each experiment in the series followed the 
same basic paradigm, viz., that employed by 
Ayllon and Sommer (1). The first experiment 
was designed to obtain information about the 
effects of electric shock at two levels of inten- 
sity, 40 v and 60 v, under restrictions of no- 
escape and simultaneous association. It was 
hypothesized that Ss who rated the shock 
“moderate” to “very unpleasant” would per- 
ceive the shock-associated stimulus in the test 
situation, whereas those Ss who rated the 
shock “not unpleasant” to “slightly unpleas- 
ant” would perceive the nonshocked stimulus 
in the test. This hypothesis was based on the 
results of the Ayllon and Sommer study (1). 


Method 


Apparatus. The stimulus materials used in the 
present study were the same three-dimensional re- 
versible profiles used in the Ayllon-Sommer study (1). 
These are shown in Fig. 1. The profiles were formed by 
grooves in 12” X 12” plaster casts. The grooves were 
FIG. 1. DIAGRAMMATIC DRAWINGS OF PLAQUES


8” in length, 54” in width, and 4%” deep. There were 
four of these plaques: Rufus, a left-pointing face; 
Clem, a right-pointing face; Horace, a full face used as a 
set-breaking figure; and a posttest profile line similar 
to both Rufus and Clem. Since the right-pointing, left- 
pointing, and ambiguous posttest profiles were cast 
from the same mold, the center profile line was identical 
for each figure. Meaningful backgrounds were painted 
in colored enamel on the plaques so that during the 
tracing of Rufus or Clem, visual recognition of the 
other face would be precluded. S wore a rubber glove 
on the tracing hand to minimize extraneous tactual 
cues and compel him to rely solely on his figure-ground 
organization of the center line in determining the 
identity of the particular face. 

The shock apparatus used in the present study was a 
Variac type with an adjustable time delay system. With 
this apparatus, the delay and duration of shock could 
be controlled precisely. The variac was equipped with 
an attached voltmeter that made it possible to com- 
pensate for fluctuations in line voltage. Shock was 
controlled by a foot pedal that made it difficult for S 
to predict shock by means of body movement cues 
from E. 

Shock was applied to the index and middle finger of 
the left hand. The electrodes were rubber cups filled 
with moist 10% zinc oxide compound, mounted on a 
small wooden block and covered with a tight elastic 
band (glove-like) from which S could not remove his 
hand. In this way, contact pressure, contact area, and 
moisture from perspiration were controlled and held 
constant for each S. 

Subjects. The Ss in Experiment I were 29 female 
undergraduate students enrolled in the elementary 
psychology course at the University of Kansas. None of 
these Ss knew, prior to the experiment, that shock 
would be used. Fifteen Ss were assigned to the Clem— 
Shock, Rufus—No-shock group, and 14 were assigned 
to the Clem—No-shock, Rufus—Shock group. 

Procedure. The method was similar to the Ayllon- 
Sommer procedure with some changes. The Ayllon- 
Sommer procedure was to present the profile that was 
to be associated with shock, administer an electric
shock of one-second duration three seconds after 
completion of the tracing, remove the profile, and then 
present the next profile. The major change from this 
procedure was to administer the shock during the 
tracing of the plaque (specifically, when S’s index finger 
made contact with the mouth of the profile). 

In addition to this change, several minor procedural 
modifications were incorporated into the present ex- 
periment. The order of presentation of the faces was 
changed in the hope of overcoming the preference 
Ayllon and Sommer had found for Rufus, the left-
pointing profile.

The order of presentation of the faces was as follows: 
Clem, Horace, Horace, Rufus, C**, H, R*, C, R, H, R*, 
c™. (oe. Rew, €™, Re, B,C”, &, BSG, 
C**, H. The E presented one face at a time and called
out the appropriate name while placing the face in 
front of S. Certain of these trials involved electric 
shock. Those marked with a single asterisk indicate 
shock for the Rufus—Shock group; a double asterisk 
indicates shock for the Clem—Shock group. In order 
to deal with the criticism that shock during the tracing 
may have a distracting effect, the faces were traced 
twice and shock was always given on the second tracing 
of the face. 

Another method was used to define the severity of 
shock along with ratings of the shock by E and by S. 
Shock was defined in terms of the applied voltage, that 
is, half of the Ss were given 40-volt shock, and half 
were given 60-volt shock. Experience of the shock as to 
degree of unpleasantness was determined by S’s rating 
of the shock along a five-point scale from “not unpleasant”
to “very unpleasant,” and by E’s rating of
S’s behavioral reactions as either slight, moderate, or 
strong. The E’s rating of shock was made after blind- 
folding S and prior to S’s first response in the test trials. 

E rated each S in terms of the overt reaction shown 
to the shock: strong, moderate, and slight. The criteria 
for rating a reaction as strong were such responses as
flinging of the arm whenever shock was applied, raising 
of finger from the table in absence of shock, and 
verbalizations regarding the shock (e.g., “Gee, that 
hurts,” “How long is this going to last”). If S showed
any or all of these reactions to a lesser degree, E rated 
her reaction to the shock as moderate. On the other 
hand, if S showed few or none of the above mentioned 
behaviors, E rated her reaction to the shock as slight. 
Immediately following the test, S was asked to rate the 
severity of the shock on a five-point scale from “not 
unpleasant” to “very unpleasant.” 

In the test series, there was a total of 27 presenta- 
tions: 14 of the ambiguous profile line and 13 of Horace 
(set-breaking full face). The latter was alternated with 
the ambiguous profile in the following sequence: 
Ambiguous, Horace, H, H, A, A, H, A, H, H, A, H, A, 
H, A, A, H, A, H, H, A, H, A, A, H, A, A. 
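
The test sequence listed above can be tallied and scored mechanically. The following is a minimal sketch in Python and is not part of the original procedure; the names `TEST_SEQUENCE` and `total_response_score`, and the example subject, are hypothetical. It confirms the 14 ambiguous and 13 Horace presentations and counts how often a subject names the shocked face on the ambiguous trials, which is the total response score analyzed in the Results.

```python
from collections import Counter

# Test-series order as listed in the text: A = ambiguous profile line,
# H = Horace (set-breaking full face).
TEST_SEQUENCE = "A H H H A A H A H H A H A H A A H A H H A H A A H A A".split()


def total_response_score(responses, shocked_face):
    """Count ambiguous trials on which S named the shocked face.

    `responses` holds one reported name per presentation, in test order;
    reports given on Horace trials are ignored.
    """
    if len(responses) != len(TEST_SEQUENCE):
        raise ValueError("expected one response per presentation")
    return sum(
        1
        for kind, name in zip(TEST_SEQUENCE, responses)
        if kind == "A" and name == shocked_face
    )


counts = Counter(TEST_SEQUENCE)
print(len(TEST_SEQUENCE), counts["A"], counts["H"])  # 27 14 13

# Hypothetical subject who reports "Clem" on every ambiguous trial.
example = ["Clem" if kind == "A" else "Horace" for kind in TEST_SEQUENCE]
print(total_response_score(example, shocked_face="Rufus"))  # 0
```

A score of 7 out of 14 corresponds to chance expectancy, the reference point used for the intra-group comparisons.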


Results


Analysis of the data based on S’s first re- 
sponse to the ambiguous figure is presented 
in Table 1. The data show that of the entire 
group, 21 Ss reported the nonpunished face 
as their first response while 8 Ss reported 

TABLE 1

FIRST RESPONSE GIVEN TO AMBIGUOUS
PLAQUE DURING TESTING SEQUENCE
(EXP. I)

[Table body illegible in the source. Surviving row labels: Shock Intensity; Rating by S (Slight, Moderate); Rating by E (Slight, Moderate); Rating by S and E* (Slight, Moderate).]

* Instances (N = 9) where S and E disagreed were discarded.
** χ² corrected for continuity.


TABLE 2

SUMMARY OF ANALYSES OF MEAN NUMBER OF
PUNISHED FACES REPORTED TO AMBIGUOUS
[...] (EXP. I)

[Table body illegible in the source. Surviving row labels: Measure of Shock; Physical; Rating by S; Rating by E (Slight, Moderate); Rating by S and E* (Slight, Moderate).]

* Instances (N = 9) where S and E disagreed were discarded.
** p (Two-tail) .05.
*** p (Two-tail) .01.


the punished face (p < .01). There was no 
significant difference found between groups. 
However, all measures of severity of shock 
reveal that the less intense shock had a less 
marked effect than did the more intense shock. 

Analysis of the mean number of total re- 
sponses to the ambiguous profile is given in 
Table 2. Here the difference between groups 
is significant for all measures of shock, showing 
that the group receiving or experiencing the 
stronger shock reported more nonpunished 


TABLE 3

FACE REMEMBERED MOST VIVIDLY (EXP. I)
AFTER TRAINING AND TEST

[Table body missing in the source.]

* Instances (N = 9) where S and E disagreed were discarded.


faces than the other group. Intra-group com- 
parisons with chance expectancy are also 
shown in Table 2. The results are similar to 
the first response data. The 40-v shock had 
little or no effect, although there is a trend 
toward more punished responses; while the 
60-v shock group shows a significant reporting 
of the nonpunished face. 

Immediately after the test series, the Ss 
were asked which face they remembered most 
vividly. If they answered “Horace,” the set- 
breaking profile, they were asked “which 
next?” Table 3 presents the faces, other than 
Horace, that were remembered most vividly. 
Although there is a definite trend for the slight- 
shock group to remember the punished face, 
only the physical measure of shock gives sig- 
nificant results. There is also a slight trend for 
the moderate shock group to report the non- 
punished face. However, this tendency is 
not significant. 


EXPERIMENT II 


Since the strongest effects had been found 
using 60 v in Experiment I, a second study was 
designed to investigate the temporal interval 
between percept and punishment. In this ex- 
periment, the procedure was the same as in 
Experiment I, except that the 60-v shock was 
administered three seconds after tracing of 
the plaques. 

Subjects


Forty-one female undergraduates at the University 
of Kansas were used. Twenty-six of these were ran- 
domly assigned to the experimental group, and 15 
were randomly assigned to the control group. 


Results 


In this experiment, the primary interest was 
the investigation of the effect of moderate to
very unpleasant shock applied three seconds 
after the tracing of the faces. At this intensity 
(60 v), only six of the 26 Ss reported the shock 
as slightly unpleasant, and 20 Ss reported the 
shock as moderately to very unpleasant. In 
addition, E rated 21 of the 26 Ss as reacting
strongly to the shock. 

The data on first response and vividness 
were not significant. The analyses of the total 
responses to the ambiguous profile for all Ss, 
for Ss who rated the shock as moderate, and 
for Ss whom E rated as reacting strongly 
yielded essentially the same result. The mean 
number of punished responses was 5.54, and 
the mean number of nonpunished responses 
was 8.46. A t test between these two means
yielded a value of 2.80, significant at the .01 
level. The results are in accordance with those 
of Smith and Hochberg (11) but show no 
trends in the direction reported by Ayllon and 
Sommer (1). 
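
Because each S gave 14 reports to the ambiguous profile, the punished and nonpunished counts are complementary, and the two reported means sum to 14 (5.54 + 8.46). The sketch below illustrates one consequence, under the assumption that the comparison is treated as a paired test: a t test on the per-S differences is numerically the same as a one-sample t test of the punished counts against the chance value of 7. The text does not state how the authors computed their t, and the eight scores used here are hypothetical, not data from the study.

```python
import math
import statistics

N_AMBIGUOUS = 14  # ambiguous presentations per S in the test series


def one_sample_t(values, mu):
    """t statistic for the mean of `values` against a fixed value `mu`."""
    n = len(values)
    se = statistics.stdev(values) / math.sqrt(n)
    return (statistics.mean(values) - mu) / se


# Hypothetical punished-face counts for eight Ss (illustration only,
# not data from the study).
punished = [3, 5, 6, 4, 7, 5, 8, 6]
nonpunished = [N_AMBIGUOUS - p for p in punished]

# Paired comparison: t on the per-S differences against zero.
diffs = [n - p for n, p in zip(nonpunished, punished)]
t_paired = one_sample_t(diffs, 0)

# Equivalent test: punished counts against the chance value of 7.
t_chance = one_sample_t(punished, N_AMBIGUOUS / 2)

# The two statistics agree in size and differ only in sign.
print(round(t_paired, 3), round(-t_chance, 3))
```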

The temporal relationship between the shock 
and the stimulus (Ayllon and Sommer gave 
shock three seconds after presentation of 
the stimulus) was not found to be a significant 
variable. In Experiment I, shock during the 
presentation of the stimulus, and in the present 
study, shock after presentation gave the same 
results; Ss reported a significant number of 
nonshocked faces when strong shock, or shock 
rated as moderate to very unpleasant, was 
used. 

There is some reason to propose a “repres- 
sion” model to explain these results. Tolman 
(private communication) suggested that the 
effects of punishment upon learning or percep- 
tion do not evidence themselves immediately; 
retesting after a time lapse might reveal more 
pronounced repressive effects. Rapaport (8) 
makes the similar suggestion that repression 
occurs after a time lapse rather than immedi- 
ately. For a preliminary test of this proposi- 
tion, all Ss in Experiment I were phoned three 
weeks to two months after the initial testing 
session. Of 29 Ss, only 17 Ss were able to recall 


either Rufus or Clem. The recall data from 
these 17 Ss were compared with their immedi- 
ate recall at the end of the initial test session, 
and it was found that four Ss changed in terms 
of the face remembered most vividly. All four 
changed from remembering the punished face 
to remembering the nonpunished face most 
vividly. No Ss switched from the nonpunished 
to the punished face in delayed recall. These 
results in conjunction with the data in Experi- 
ment I and Experiment II suggest that “re- 
pression” is operating in this experimental 
situation. 


EXPERIMENT III 


On the basis of the first two exploratory 
experiments, a final study of broader scope was
planned, involving the variables of “escape 
possibilities,” physical intensity of shock, and 
perceived intensity of shock. Obviously, the 
latter two variables are somewhat correlated. 

It was hypothesized, first, that when S can 
escape from shock by alerting himself to cer- 
tain percepts and not to others, he tends to 
perceive the shock-associated percept in the 
test situation. On the other hand, when S can- 
not escape from the shock, “perceptual re- 
pression” occurs in that he tends to perceive 
the nonshocked stimulus in the test situation. 
The only restriction placed on this hypothesis 
was that under low intensities of shock (far 
below pain threshold) there should be a dyna- 
mogenic effect (9) and both “escape” and 
“nonescape” groups should perceive the shock-
associated profile in the test.

As the intensity of the shock is increased, 
second, a progressively greater difference be- 
tween escape and no-escape conditions is 
hypothesized: specifically, a significant inter- 
action between intensity of shock and escape 
conditions. 

Our third major hypothesis asserts that 
perceptual alerting and perceptual repression 
are a complex function of the physical intensity 
of the electric shock. We did not attempt to 
specify the exact form of this functional rela- 
tionship. 

Fourth, we expected a significant correlation
between perceived intensity of shock and perceptual
alerting or perceptual repression.

A fifth hypothesis was proposed tenta- 
tively. We had observed in Experiments I and 
II that there were shifts in recalled vividness 
of faces with Ss who perceived the shock-
associated profile in the test, whereas Ss who
perceived the nonshock-associated profile in 
the test recalled the same profile most vividly. 
We hypothesized on this basis that Ss brought 
back into the same experimental room and 
given a replication of the test without repeated 
training would perceive the nonshocked profile 
predominantly. The basis for this argument 
was that “anxiety” associated with the train- 
ing would continue incubating (Bindra and 
Cameron [2]) and would subsequently “repress” 
the shock-associated percept. 

FIG. 3. AVERAGE NUMBER OF SHOCKED FACES
REPORTED BY EACH ESCAPE CONDITION


Subjects 


Seventy-nine female undergraduates at the Uni- 
versity of Kansas served as Ss. None knew that shock 
was to be used, and none reported knowing about the 
study in advance. Ss were randomly assigned to one 
of seven experimental conditions. 


Apparatus and Procedure 


The apparatus and materials were the same as 
those used in Experiments I and II, and the pro- 
cedure was basically the same. Shock levels of 18, 25, 
50, and 75 volts were used. In the 75-volt condition, 
only six volunteer Ss were tested and only under 
conditions of escape. In the no-escape condition, a 
tight elastic glove was fitted around the S’s hand so 
that she could not remove her hand from the electrodes. 
In the escape condition, the elastic glove was omitted, 
and S could remove her hand from the shock apparatus. 

For the delayed testing procedure, only the 50-volt 
group was brought back and given a replication of the 
test. These Ss were given only the test procedure and 
no additional shock was administered. The 30 Ss were 
divided into three groups of 10 Ss with an equal distri- 
bution of Ss from each escape condition. Group I was 
retested after 7, 14, and 28 days; Group II after 14 
and 28 days; Group III after 28 days. 
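
The staggered retest schedule can be laid out explicitly. The sketch below is an illustration only; the names `RETEST_DAYS` and `sessions_by_day` are hypothetical. It lists, for each retest day, which groups are tested and whether the session is that group's first, second, or third retest, which shows how the design allows delay to be compared with and without prior retesting.

```python
# Retest days after training for each delayed-testing group (from the text).
RETEST_DAYS = {
    "Group I": [7, 14, 28],
    "Group II": [14, 28],
    "Group III": [28],
}
SS_PER_GROUP = 10  # three groups of 10 Ss drawn from the 50-volt condition


def sessions_by_day(schedule):
    """Map each retest day to (group, ordinal number of that group's retest)."""
    by_day = {}
    for group, days in schedule.items():
        for ordinal, day in enumerate(days, start=1):
            by_day.setdefault(day, []).append((group, ordinal))
    return dict(sorted(by_day.items()))


total_ss = SS_PER_GROUP * len(RETEST_DAYS)
total_sessions = SS_PER_GROUP * sum(len(d) for d in RETEST_DAYS.values())

for day, groups in sessions_by_day(RETEST_DAYS).items():
    print(day, groups)
print(total_ss, total_sessions)  # 30 60
```

On day 28 all three groups are tested, with Group III receiving its first retest and Group I its third.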


Results 


First Response. The first response given by 
each S in each condition was scored as to 
whether it was the shocked or the nonshocked 
profile. Figure 2 shows the percentage of Ss, in 
each condition, reporting the shocked profile 


TABLE 4

ANALYSES OF VARIANCE FOR ESCAPE AND
NO-ESCAPE CONDITIONS

Condition   Source                      df   Mean square   F      p
Escape      18, 25, 50, 75 volts         3      32.20      3.28   <.05
            Within                      39       9.81
No-escape   18, 25, 40, 50, 60 volts     4      16.97      2.28   <.10
            Within                      39       7.42

TABLE 5

COMPARISON OF NUMBER OF SHOCKED FACES
REPORTED IN ESCAPE AND NO-ESCAPE
CONDITIONS

[Table body largely illegible in the source. Surviving column heads: Escape Conditions, mean, standard error of the mean, t. The only legible entries are 9.62 (SE .943) and 7.00 (SE .966).]

with voltage level of the associated shock.
This figure makes use of the data from Experi- 
ment I where 40 and 60 volts were presented 
under no-escape conditions. In the escape 
conditions, 24 out of 43 Ss reported the non- 
shocked face, whereas in the no-escape con- 
ditions, 42 out of 64 Ss reported the non- 
shocked face. Overall, 66 out of 107 Ss reported 
the nonshocked face. 
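
The first-response counts just given can be restated as proportions. This is a small arithmetic sketch, with the hypothetical names `FIRST_RESPONSE` and `proportion`; it only checks that the escape and no-escape counts add up to the overall figure and expresses each as a fraction of its group.

```python
# First-response counts from the text:
# (Ss naming the nonshocked face, number of Ss in the condition).
FIRST_RESPONSE = {
    "escape": (24, 43),
    "no-escape": (42, 64),
}


def proportion(k, n):
    """Fraction of n Ss who named the nonshocked face."""
    return k / n


total_k = sum(k for k, _ in FIRST_RESPONSE.values())
total_n = sum(n for _, n in FIRST_RESPONSE.values())

for condition, (k, n) in FIRST_RESPONSE.items():
    print(f"{condition}: {k}/{n} = {proportion(k, n):.3f}")
print(f"overall: {total_k}/{total_n} = {proportion(total_k, total_n):.3f}")
```

The nonshocked face is the majority first response in both conditions, somewhat more so under no-escape.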

Total Response Scores. The total number of 
times, out of 14 possible, that S reported the
shocked face was also taken as a score. The 
relationship between average number of 
shocked faces reported and associated voltage 
level is shown in Fig. 3. There is a difference
between the various levels of shock at the .05 
level for the escape group and at the .10 level 
for the no-escape group. This is shown in the 
analyses of variance summarized in Table 4. 
The same data were also analyzed by means of 
nonparametric median tests which confirmed 
the results of the analyses of variance. 
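
The F ratios in Table 4 follow directly from the mean squares. The sketch below assumes the Table 4 entries are read as between-levels mean squares of 32.20 (escape) and 16.97 (no-escape) with within-groups mean squares of 9.81 and 7.42; the names `ANOVA` and `f_ratio` are hypothetical. It recomputes each F as the ratio of the two mean squares.

```python
# Mean squares as read from Table 4 (assumed assignment of values to cells).
ANOVA = {
    "escape": {"ms_between": 32.20, "ms_within": 9.81},
    "no-escape": {"ms_between": 16.97, "ms_within": 7.42},
}


def f_ratio(ms_between, ms_within):
    """F statistic: between-levels mean square over within-groups mean square."""
    return ms_between / ms_within


for condition, ms in ANOVA.items():
    f = f_ratio(ms["ms_between"], ms["ms_within"])
    print(f"{condition}: F = {f:.3f}")
```

The recomputed ratios agree with the tabled values of 3.28 and 2.28 to within rounding.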

Table 5 summarizes the means and standard
errors of the means of the number of shocked
faces reported for each of the conditions in-
volved in this experiment, as well as t tests
between escape and no-escape conditions at 
each level of shock. 

Most experimenters do not state the physical 
values of the shock presented to S but only 
the fact that the shock was rated at a certain 
unpleasantness level. Figure 5 accordingly 
disregards voltage levels and shows the rela- 
tionship between average number of shocked 
faces reported and judged unpleasantness. 
There is an almost perfectly symmetrical 
relationship on either side of a horizontal 
line, between the two variables for the escape 
and no-escape groups. Figure 4 shows the re- 
lationship between average rank unpleasant- 
ness and voltage administered. There is a 
strong, positive correlation between perceived 
or judged unpleasantness and the physical 
values of the shock for the escape group but 
little or no correlation between these two 
indices of punishment for the no-escape group. 

Recall Analyses. Immediately following the 
test phase of the experiment, the Ss had been 
asked which face they recalled most vividly,
which next, and so on. Figure 6 shows the 
relationship between proportion of Ss recalling 
the shocked face more vividly than the non- 
shocked and voltage level used in association. 
Both the escape and no-escape groups show a 
significant tendency to recall the shocked 
face as more vivid than the nonshocked up to 
60 volts, after which level the nonshocked 
face is recalled most vividly. As physical 
intensity of shock increases in training, more
and more nonshocked faces are recalled as 
more vivid. 

Delayed Testing Analyses. No significant 
trends were found, either as to reporting the 
shocked or the nonshocked profiles, with delay 
in testing. The escape group tended slightly 
to report more shocked profiles following delay 
in testing, whereas the no-escape group tended 
slightly to report more nonshocked profiles. A 
few Ss in each group showed marked shifts 
in perceptual reports during the second week 
after the training. However, these dramatic 
shifts in reporting either the shocked or non- 
shocked profiles seem to reflect individual 
differences rather than kind of training. 


DISCUSSION 

A brief review of the major results is in order. 
We found that reporting of the nonshocked 
face increased as a function of increasing 
severity of shock. When Ss could escape from 
the electric shock they were more likely to 
“emphasize” the shock-associated percept than 
when they could not escape from the shock. 
There was a strong, positive correlation be- 
tween judged unpleasantness of the shock 
and the physical voltage administered for the 
escape conditions, but no significant correla- 
tion for the no-escape condition. The shock- 
associated profile was recalled as most vivid 
by both escape and no-escape groups, this 
effect decreasing with increasing amounts of 
shock. Finally, with delayed testing, there
were dramatic shifts toward reporting more 
nonshocked faces approximately in the second 
week after training, this trend being largely 
a function of individual differences rather than 
training conditions. 

Certain weaknesses in our studies should 
be recognized at the outset. It would of course 
have been highly desirable to repeat all of 
our studies in the form of one multifactorial 
design. Another problem arises from the fact 


that the escape group reduced the duration 
of shock by their escape, so that the duration/ 
intensity relationship was not the same for 
escape and no-escape conditions. Finally, and 
perhaps more serious, the question can rea- 
sonably be asked: “Is this perceptual learn- 
ing?” It is clear that our design and data do 
not guarantee that we were changing “per- 
ception” rather than verbal habits. We can 
only point out that the consistency of our 
findings suggests to us that something more 
than mere verbal habits was altered, and there 
are enough indirect hints to suggest that “per-
ception” of the tactual plaques changed as a
function of the punishment association. 

Our data agree with Smith and Hochberg 
(11), Dulany (4), Pustell (7), and Reece (10) 
in that severe punishment, i.e., electric shock, 
disrupts or interferes with perceptual organi- 
zation, especially when the S cannot escape 
from the punishing event. We have gone fur- 
ther and have shown that with mild shock 
and escape conditions an “alerting” effect is 
obtained in that the punished percept is em- 
phasized in a tactual figure-ground organiza- 
tion. Indeed, at all levels of shock, escape seems 
to make the S more alert to the associated 
percept than does no-escape. One is much 
more likely to obtain an “autistic” organiza- 
tion (5) when the S cannot overtly escape 
from the shock than when he can. 

Our data also shed light on the problem of 
whether to use subjective evaluations of un- 
pleasantness of shock or to use physical meas- 
ures of shock. Subjective evaluations and phys- 
ical measures seem to be interchangeable, 
functionally, under escape conditions. There 
is such a strong, positive correlation between 
the two in the escape condition that they must 
be reflecting the same psychological phenom- 
enon. In the no-escape condition, however, 
there is virtually no correlation between the 
two and they are not functionally interchange- 
able. It is highly possible that in the no-escape
condition, Ss are rating the total unpleasantness
of the situation, including the unpleasantness
of being strapped into the elastic glove.
There are other complications, too, in that 
adaptation to the shock is more probable in 
the no-escape condition than in the escape 
condition. 

Our use of the concept “repression” to make 
predictions concerning delayed testing results 
warrants some comment. Dollard and Miller 
(3) have pointed out that perceptual cues can
acquire reinforcing properties through past 
association with punishment and produce be- 
havior similar to that elicited by a primary 
negative stimulus. In our study, the percept
associated with shock should thus acquire 
secondary aversive properties, which might 
appear upon delayed testing. Rapaport (8) 
has emphasized that anxiety associated with 
percepts increases after an unpleasant experi- 
ence and that these percepts become “re-
pressed,” the repression increasing with the
passage of time. The term “repression” is 
used here in a purely descriptive sense. Our 
results indicate that “repression” does not 
immediately develop following the shock but 
sets in about two weeks afterwards. The mech- 
anisms underlying repression of the shock- 
associated percept remain to be clarified. 


SUMMARY 


Three experiments are reported in which 
tactual profiles of faces were associated with 
electric shock of either 18, 25, 40, 50, 60, or 75 
volts administered to the unpreferred hand, 
not used in tracing the plaque. In Experiments 
I and III the shock was temporally contiguous 
with the traced profile, and in Experiment II 
it was given three seconds afterwards. In some 
conditions, the subjects were allowed to escape 
from shock and in others they were not. Imme- 
diately following the training phase, in which 
the shock was associated with one tactual 
profile and no-shock with the alternative pro- 
file, the subject was blindfolded and traced an 
“ambiguous” line (which was the identical 
contour for both the shocked and the non- 
shocked profiles), reporting which profile was 
being traced. Subjects also rated the unpleas- 
antness of the electric shock and reported 
which face they remembered most vividly. 

The data were analyzed with the following 
results: 

1. As intensity of shock increases, there is 
more and more reporting of the nonshocked 
profile in the test situation. 

2. Escape conditions lead to more reporting 
of the shocked profiles than do the no-escape 


conditions, i.e., escape is more likely to lead to 
emphasis of the shocked profiles in the tactual 
figure-ground organization than is no-escape. 

3. For nearly all levels of shock, except 60 
and 75 volts, the shocked profile is recalled 
most vividly. 

4. There is a strong, positive correlation be- 
tween subjective evaluation of the unpleasant- 
ness of the shock and the physical value of the 
shock in the escape condition, but there is 
little or no correlation between these two vari- 
ables in the no-escape condition. 

5. When testing with the ambiguous line is 
delayed (up to three weeks), there is more and 
more reporting of the nonshocked profile. This 
effect was not significant overall but was sig- 
nificant for some individuals. 


INTRALIST SIMILARITY AND VERBAL ROTE LEARNING
PERFORMANCE OF SCHIZOPHRENIC AND CORTICALLY
DAMAGED PATIENTS¹


ROBERT C. CARSON²
Northwestern University 


THE problems with which this study is
concerned are rooted in controversies
regarding the performance of schizophrenic
and brain-damaged individuals on
certain intellectual tasks purporting to meas- 
ure “abstracting” ability. On such tasks, 
organics and schizophrenics perform differ- 
ently from normals and on an inferior level. 
Clinicians generally agree that organics do so 
because they are affected with what Goldstein 
(6, 8) has termed a loss of the abstract atti- 
tude; they cannot perceive similar or common 
elements, and every object becomes a concrete, 
unique entity divorced from a class. Less 
agreement exists as to the nature of the differ- 
ence between schizophrenic and normal per- 
formance on this type of task. It is probable, 
however, that a majority of clinicians favor 
the view expressed by Cameron (1, 2), who 
conceives the schizophrenic as a person whose 
classes are too broad and inclusive owing to a 
functional thought disturbance. 

One implication of these views is that 
organic patients suffer from an inability to 
make a common response to stimuli which are 
similar, while schizophrenic patients suffer 
from the opposite deficiency of being unable 
to make differential responses to stimuli 
which are different. Some recent investigators 
have reasoned from this that the concept of 
stimulus generalization (which refers to the 
process whereby a response regularly elicited, 
through training, by one stimulus tends to be 
elicited without prior experience by other, simi- 
lar stimuli) might be useful in providing a 
more precise and potentially productive ex- 
planation of this type of pathological behavior 
than has heretofore been achieved. If this 


1 This report is based on a dissertation submitted to 
the Graduate School of Northwestern University in 
partial fulfillment of the requirements for the degree of 
doctor of philosophy. Thanks are due to Janet A. 
Taylor, dissertation adviser, for her generous help and 
encouragement during the investigation. 

2 Now at Department of Psychiatry, University of 
Chicago School of Medicine. 


argument is sound, it implies that organics 
should exhibit less stimulus generalization (SG) 
than normals, while schizophrenics should 
exhibit more SG than normals. Garmezy (4) 
and Mednick (14) have both presented results 
indicating that in a simple S-R situation 
schizophrenics do in fact exhibit more SG 
than normals, although in the latter study the 
findings were not definitive. In a second por- 
tion of his study, Mednick found impressive 
confirmation of the hypothesis that cortically 
damaged organics exhibit less SG than do 
normals. 

The results suggest, then, that distortion in 
the gradient of SG may be one determinant 
of the intellectual disturbances observed in 
schizophrenia and organic brain damage. It is 
the purpose of the present study to examine 
the performance of schizophrenics and corti- 
cally damaged organics in a verbal rote 
learning task, attempting to predict certain 
aspects of their performance by means of 
these notions about SG. 

Modern interpretations of the nature of 
verbal rote learning are generally based on the 
formulations of Gibson (5). Typically, the list 
of items consists of a set of stimuli and a set 
of responses to be correctly associated with 
these stimuli. To the extent that the stimuli 
are similar (formally or in meaning), the re-
sponse to one stimulus tends to generalize to
(be evoked by) other stimuli in the list in pro-
portion to the degree of similarity between
them. This generalization results in the occa- 
sional blocking of correct responses by inter- 
ference from other responses with a consequent 
retardation of performance. Obviously, the 
extent to which generalization tendencies of 
this type develop (and hence, other things 
being equal, the difficulty of the list) is a func- 
tion of intralist stimulus similarity. This 
interpretation of the verbal learning process 
has held up well experimentally and many 
observations are integrated by means of 
Gibson’s conceptual framework (19). 

From what has been said regarding SG in
schizophrenic and cortically damaged patients, 
it follows that these two groups should, on 
a verbal rote learning task, evidence differen- 
tial rates of increase in difficulty with in- 
creasing intralist similarity: This rate of 
increase should be relatively high for schizo- 
phrenics, who show much SG; and it should 
be relatively low for cortically damaged 
organics, who show little SG. Specifically, the 
hypotheses of this study are as follows: 

(a) As intralist similarity increases, schizo- 
phrenics exhibit a more pronounced rate of 
increase in difficulty than a normal control 
group. 

(b) As intralist similarity increases, corti-
cally damaged organics exhibit a less pro-
nounced rate of increase in difficulty than a
normal control group. 

It should be noted that nothing is said here 
about relative levels of performance. Only 
differences in the slope of the performance 
curve when intralist similarity is varied, or an 
interaction between lists varying in degree 
of intralist similarity and neuropsychiatric 


condition (organic-schizophrenic-normal) of 
the Ss, are predicted. 


METHOD 
Subjects 

Three groups of male Ss, distinguished by neuro- 
psychiatric condition, were employed. A schizophrenic 
(Sc) group, a cortically damaged organic (O) group, 
and a neuropsychiatrically normal (N) group, each 
containing 45 Ss, were selected on the basis of diag- 
nostic and other criteria described below from the 
current patient populations of Veterans Administra- 
tion hospitals in the Chicago area. Suspected mental 
defectives, persons having serious uncorrected visual 
defects, and persons having less than eight years of 
formal education were excluded.

Schizophrenics. The Sc group was composed of pa- 
tients having a well-established diagnosis of schizo- 
phrenic reaction. The diagnosis was considered well- 
established when there was no evidence in the case 
folder of any disagreement in arriving at it. An attempt 
was made to include only relatively early chronic cases. 
Selection was limited to patients within the 20—50-yr. 
age range whose first psychiatric hospital admission 
occurred not less than 6 mos. nor more than 10 yrs. 
previously. No patient was included who had had more 
than 35 electroconvulsive shock or insulin coma treat- 
ments, or any assaultive somatic therapy within 3 mos. 
prior to participation in the study. In addition, no 
patient was included in whose case folder there was any 
suggestion of the possible presence of organic brain 
pathology or chronic alcoholism. For the entire Sc 
group, the mean age was 32.84 yrs.; the mean amount 
of formal education completed was 12.02 yrs., and the 
mean WAIS vocabulary score was 41.78. 

Organics. Only patients with a well-established 
diagnosis of cortical brain damage were included in the 
O group. This diagnosis was considered well-established 
in cases where there had been brain surgery, or where 
definite neurological findings were substantiated by at 
least one of the common ancillary methods of neuro- 
logical diagnosis (e.g., electroencephalography). It was 
originally hoped that the 20-50-yr. age criterion could 
be observed with reference to this group, but this 
proved impossible because of sheer lack of availability. 
The age range of the O group was 24-77 yrs., with a 
mean of 55.18 yrs. Any patient whose case folder had 
in it material which suggested that he was currently a 
serious psychiatric problem or that he had ever in the 
past been treated for a problem of this kind was ex- 
cluded. Also excluded were patients exhibiting aphasic 
symptomatology. The O group had a mean WAIS 
vocabulary score of 36.09, and had completed, on the 
average, 9.64 yrs. of formal education. 

Normals. The N group was composed of general 
medical and surgical patients who had received neuro- 
psychiatric clearance from the physicians having re- 
sponsibility for their cases. No patient who was known 
to have been treated for a psychiatric illness or for an 
intracranial neurological condition was included. Also 
excluded were known chronic alcoholics. The age range 
of the N group was 20-50 yrs., the mean being 35.69 
yrs. The means for WAIS vocabulary score and amount 
of formal education were 36.71 and 10.62 yrs. respec- 
tively. 


Materials 


Three serial word lists of nine adjectives each, shown 
in Table 1, were constructed, using only words which 
could be assumed to be familiar to the Ss. The average 
familiarity values of the three lists, as measured by 
Thorndike-Lorge (18) word counts, were approximately 
equal. The three lists differed in degree of intralist 
similarity of meaning. Roget’s Thesaurus (15) was 
used as a reference for deciding similarity of meaning. 

The low similarity list (I) consisted of nine adjec- 
tives which appeared to be quite dissimilar to each 
other in meaning, sound, and formal characteristics. 
The medium similarity list (II) consisted of four 
groups (three with two members and one with three 
members) of adjectives, the members of each group 
being similar in meaning to each other but dissimilar 
in meaning to the members of the other groups. The 
high similarity list (III) consisted of two groups (one 
with five members and one with four) of adjectives, 
each containing words selected on the basis of their 
similarity of meaning. Intralist similarity in dimen- 
sions other than meaning was minimized as much as 
possible in the construction of both this and the me- 
dium similarity list. A practice list, also reproduced in 
Table 1, consisting of seven dissimilar three-letter 
nouns was also employed. The arrangement of items 
within each of the lists was unsystematic. 

All lists were printed in India ink on endless white 
tapes in upper-case letters approximately }4-in. high. 
They were presented by means of a Hull-type memory 
drum. To indicate the beginning of the list, a starter 
item (000) was printed at the top of each. There was a 
2-sec. rate of exposure of items with an 8-sec. intertrial 
interval (12 sec. in the case of the practice list). 

TABLE 1 
EXPERIMENTAL MATERIALS 

Experimental lists 

III 
Quick 
Merry 
Joyful 
Fast 
Gay 
Rapid 
Swift 
Happy 
Speedy 


Procedure 


Each of the three main groups (Sc, O, and N) was 
subdivided into three subgroups, each containing 15 
Ss. Each of the three subgroups within a main group 
learned a different experimental list of serial adjec- 
tives, described above. Certain performance criteria, to 
be outlined later, were established to determine the 
eligibility of Ss to serve in the experiment. Subjects 
continued to be tested until 15 usable Ss were obtained 
for each of the subgroups. Formation of subgroups 
was accomplished by alternate assignment of Ss as 
they appeared to be tested. An exception to this gen- 
eral rule was made where there was clear evidence that 
this procedure would result in excessive variation 
among subgroups in practice list performance. Assign- 
ment of Ss was then manipulated to obtain subgroups 
matched on practice list performance. There were, then, 
nine subgroups of 15 Ss each, each subgroup represent- 
ing a different combination of neuropsychiatric condi- 
tion and experimental list learned. Age, education, and 
vocabulary score differences among subgroups within 
any main group were negligible. 
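The alternate assignment described above can be sketched as a simple rotation through the three lists in order of arrival. This is only an illustration under stated assumptions: the function names and the subject numbering are hypothetical, and the sketch ignores both the exclusion of unusable Ss and the exception made to match subgroups on practice list performance.

```python
from itertools import cycle

# The three experimental lists (low, medium, high intralist similarity).
LISTS = ("I", "II", "III")


def assign_alternately(subject_ids, lists=LISTS):
    """Assign subjects to lists in rotation, in their order of arrival.

    Hypothetical helper: the paper states only that assignment was
    'alternate'; a strict rotation is an assumption made here.
    """
    rotation = cycle(lists)
    return {sid: next(rotation) for sid in subject_ids}


def subgroup_sizes(assignment):
    """Count how many subjects ended up learning each list."""
    sizes = {}
    for list_name in assignment.values():
        sizes[list_name] = sizes.get(list_name, 0) + 1
    return sizes


# 45 usable Ss in one main group, numbered 1..45 for illustration only.
assignment = assign_alternately(range(1, 46))
sizes = subgroup_sizes(assignment)
```

With 45 usable Ss per main group, a strict rotation yields three subgroups of 15, which matches the subgroup size reported in the text.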

After preliminary instructions, each S was given 10 
learning trials on the practice list in order (a) to famil- 
iarize Ss with and afford them some practice in verbal 
rote learning of this type, (b) to aid in further selecting 
Ss by providing a basis for excluding those who were 
not likely, for whatever reason, to have any success in 
performing with the experimental lists, and (c) to 
afford a method of equating the three subgroups within 
each main group. If S failed to respond on the second 
practice list trial, E said, “Go ahead.” Any S requiring 
help beyond this was excluded from the study. Only 
those Ss who were unable on the 10th trial or before 
to produce in their correct sequential placement four 
of the seven words of the practice list were excluded 
on the basis of slowness of learning. Inadequate prac- 
tice list performance resulted in the exclusion of 5 
Normals, 13 Schizophrenics, and 10 Organics. Ss who 
succeeded in rendering two consecutive perfect recita- 
tions of the practice list before the 10th trial were 
given credit for their remaining trials and allowed 
to proceed to the next list. 

Approximately 1 min. after completion of the practice 
list trials, each S meeting the selection criteria was 
given 20 trials on the experimental list appropriate for 
the subgroup of which he was a member. Ss who suc- 
ceeded in rendering two consecutive perfect recitations 
of the experimental list within the 20 trials were dis- 
continued at that point and given credit for perfect 
performance on their remaining trials. Following per- 
formance on the experimental list, every S was given 
the Vocabulary subtest of the Wechsler Adult Intelli- 
gence Scale to provide a rough index of his intellectual 
level. 
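The selection and scoring rules in the two preceding paragraphs can be sketched as follows. The function names are hypothetical, and the sketch assumes that a perfect trial equals one correct anticipation per list item (7 for the practice list, 9 for an experimental list); the example trial scores are invented purely to show the arithmetic and are not data from the study.

```python
def meets_practice_criterion(trial_scores, minimum_correct=4):
    """True if S placed at least `minimum_correct` words correctly on
    any practice trial (four of the seven words, per the text)."""
    return any(score >= minimum_correct for score in trial_scores)


def total_correct(trial_scores, list_length, allotted_trials):
    """Total correct anticipations over the allotted number of trials.

    If testing was discontinued early, which the procedure allows only
    after two consecutive perfect recitations, every remaining trial is
    credited as perfect.
    """
    scores = list(trial_scores)
    if not scores or len(scores) > allotted_trials:
        raise ValueError("need between 1 and allotted_trials recorded trials")
    if any(not 0 <= s <= list_length for s in scores):
        raise ValueError("each trial score must lie between 0 and list_length")
    remaining = allotted_trials - len(scores)
    if remaining and scores[-2:] != [list_length, list_length]:
        raise ValueError("early stop requires two consecutive perfect trials")
    return sum(scores) + remaining * list_length


# Invented example: S reaches criterion on trials 4 and 5 of a 9-item list.
example_trials = [3, 5, 7, 9, 9]
example_total = total_correct(example_trials, list_length=9, allotted_trials=20)
```

In the invented example, the five recorded trials contribute 33 correct anticipations and the 15 credited trials contribute 135, for a total of 168.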


RESULTS 


Practice list. The data of the experiment 
were organized in terms of the number of cor- 
rect anticipations made by S in the allotted 
number of trials. The practice list data showed 
considerable variation in performance among 
the three main groups, but the performances 
of the subgroups within each main group were 
substantially equivalent. Appropriate analyses 
of variance revealed the following: (a) the 
practice list performance of the N group was, 
on the whole, significantly superior to that of 
the O group (p < .001) and of the Sc group 
(p < .01), and (b) as expected, there were no 
significant differences among the practice list 
performances of subgroups differentiated ac- 
cording to the experimental lists subsequently 
learned, thus permitting analysis of the effects 
of the experimental lists. 

Experimental lists. Learning curves prepared 
separately for each of the nine subgroups 
indicated that the relative performance levels 
of subgroups remained fairly stable through- 
out the course of the 20 learning trials on the 
experimental lists. Accordingly, only the data 
for performance over the entire 20 trials were 
analyzed. The mean number of correct anticipations 
made on the appropriate experimental 
list was computed for each subgroup; these 
means are presented graphically in Fig. 1. 
Since practice and experimental list perform- 
ance were found to be correlated (r = .59 for 
all 135 Ss), analysis of covariance was applied 
where appropriate in the statistical evaluation 
of the experimental list data. Tests for homo- 
geneity of variance and for homogeneity of 
regression of experimental on practice list 
scores failed to show significant heterogeneity 
of either type among the subgroups. 

Organics vs. normals. It was hypothesized 
that, as intralist similarity increases, the O 
group exhibits a less pronounced rate of in- 
crease in difficulty than the N control group. 
The hypothesis seems to be supported by the 
results shown in Fig. 1. Contrary to expecta- 
tions, however, an analysis of covariance per- 
formed on the O and N group data showed the 
interaction term for neuropsychiatric (NP) 
conditions × lists to be short of statistical 
significance, F having a probability of approx- 
imately .15. The list effects and the NP condi- 
tion effects, taken independently, were both 
highly significant (p <.005 in both cases). In 
view of the use of analysis of covariance pro- 
cedures, the significant difference between the 
O and N condition is of special interest be- 
cause it represents a difference between the O 
and N groups in performance on the experi- 
mental lists after their scores on these lists 
have been adjusted to equality on the basis of 
practice list performance. This means in effect 
that the difference in difficulty between the 
practice and experimental lists was generally 
greater for the O group than for the N group, 
even though the coefficient of regression of 
experimental on practice list scores was not 
significantly different for the two groups. 
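The adjustment referred to above can be written out in the usual textbook form of the analysis of covariance. This is a sketch of the general technique, not necessarily the authors' exact computation, and the symbols are notation introduced here rather than taken from the original.

```latex
% Adjusted experimental-list mean for group j (j = O, N).
% \bar{Y}_j : observed mean experimental-list score of group j
% \bar{X}_j : mean practice-list score of group j
% \bar{X}   : mean practice-list score over both groups
% b_w       : pooled within-group regression of experimental on practice scores
\bar{Y}_j^{\,\mathrm{adj}} = \bar{Y}_j - b_w \left( \bar{X}_j - \bar{X} \right)
```

A significant NP condition effect in this analysis means that the adjusted means of the O and N groups still differ once the groups have been equated statistically on practice list performance.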

Failure to find a significant NP conditions 
× lists interaction represents, of course, a 
failure to obtain support for the hypothesis. 
Nevertheless, it must be pointed out that the 
interaction term, while not reaching a conven- 
tionally acceptable significance level, does 
approach it rather closely. The obtaining of a 
significant interaction term in a statistical 
test, particularly where, as in the present case, 
the test is made on relatively small samples 
from highly variable populations, is notoriously 
difficult. A supplementary method of testing 
the hypothesis was therefore employed. 

The experimental list data were grouped 
separately for the N and O conditions, and a 
simple analysis of variance was performed to 
test list effects in each condition. Summaries of 
these analyses appear in Table 2. For the N 
group, list effects are highly significant (p 
<.01), while for the O group, the list effects 
do not approach a statistically significant 
level, F being less than 1.00. Thus, it would 
appear that for the O group, difficulty does not 
vary significantly with changes in intralist 
similarity, whereas for the N group there is a 
highly significant increase in difficulty with 
increasing intralist similarity. 
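As a check on the arithmetic, the F ratio for the N group can be recomputed from the two mean squares that survive in Table 2. The degrees of freedom are not legible in the table and are inferred here from the design (three subgroups of 15 Ss), which is an assumption; the function name is hypothetical.

```python
# Mean squares for the normal (N) group as printed in Table 2.
MS_BETWEEN_LISTS = 3811.82
MS_WITHIN_SUBGROUPS = 630.48

# Degrees of freedom inferred from the design (assumption: 3 lists x 15 Ss).
N_LISTS = 3
N_PER_SUBGROUP = 15
df_between = N_LISTS - 1
df_within = N_LISTS * (N_PER_SUBGROUP - 1)


def f_ratio(ms_between, ms_within):
    """F for a simple analysis of variance: between-groups mean square
    divided by within-groups mean square."""
    if ms_within <= 0:
        raise ValueError("within-groups mean square must be positive")
    return ms_between / ms_within


f_normals = f_ratio(MS_BETWEEN_LISTS, MS_WITHIN_SUBGROUPS)
print(f"F({df_between}, {df_within}) = {f_normals:.2f}")  # F(2, 42) = 6.05
```

An F of about 6.05 on 2 and 42 degrees of freedom is consistent with the significance level reported for the N group (p < .01).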

While these analyses appear to confirm the 
hypothesis of less generalization in the O 
than in the N group, it could be argued that 
the lack of difference among the O subgroups 
is due to their general and pronounced in- 
feriority; Ss learning the medium and high 
similarity lists could not, according to this 
interpretation, be much worse than those in 
the low similarity group. In order to test this 
possibility, the five Ss having the lowest 
experimental list scores in each of the three N 
subgroups, and the five Ss having the highest 
experimental list scores in each of the three O 
subgroups were selected for special study. 
Practice list scores for these Ss were subjected 
to analyses of variance, in which no significant 
differences were found between subgroups 
within either the high O or low N groups. The 
mean number of correct anticipations on the 
appropriate experimental list was computed for 
each of the six selected subgroups, and inspec- 
tion revealed that, although the high Os and low 
Ns were roughly equal with respect to general 
level of performance, the means of the Ns 
decreased in regular fashion as intralist similarity 
increased, while those of the Os did not. 
This impression was confirmed by analyses of 
variance performed on these data: for the low 
Ns, list effects were highly significant (p 
<.001), whereas for the high Os, list effects 
were not significant, F again being less than 
1.00. Thus, the general O and N group findings 
were duplicated in this miniature analysis 
employing selected Ss. One may conclude, 
then, that the hypothesis is essentially sup- 
ported. As meaningful intralist similarity in- 
creases, cortically damaged organics exhibit 
a less pronounced rate of increase in difficulty 
than do neuropsychiatrically normal Ss. In- 
deed, the results suggest that the organics 
exhibit no significant increase in difficulty 
with increasing intralist similarity. This effect, 
moreover, appears to be independent of the 
general level of performance. 

TABLE 2 
SUMMARIES OF ANALYSES OF VARIANCE OF EXPERIMENTAL LIST SCORES 
FOR NORMAL, ORGANIC, AND SCHIZOPHRENIC GROUPS 

Group            Source of variation          df    Mean square 
Normals          Between subgroups (lists)          3811.82 
                 Within subgroups                    630.48 
                 Total 
Organics         Between subgroups (lists) 
                 Within subgroups 
                 Total 
Schizophrenics   Between subgroups (lists) 
                 Within subgroups 
                 Total 

Schizophrenics vs. normals. It is apparent 
from Fig. 1 that the performance of the Sc 
group on the experimental lists unexpectedly 
approximates that of the N group. Only on 
List II did the difference appear very great, 
and even here it was not statistically signif- 
icant. An analysis of variance performed on 
the experimental list data for the N and Sc 
group showed no significant NP condition 
effects (p >.10). The N and Sc groups there- 
fore cannot be regarded as differing with 
respect to level of performance on the experi- 
mental lists. 

The hypothesis stated that as intralist simi- 
larity increases, the Sc group exhibits a more 
pronounced rate of increase in difficulty than 
N control Ss. An analysis of covariance 
showed the F for the NP conditions × lists 
interaction to be far below an acceptable level 
of significance (p >.20). Moreover, it is clear 
from inspection of the data and from the sup- 
plementary analysis described below that any 
interaction occurring here was in a direction 
opposite to that predicted. As expected, the F 
for NP conditions was not significant, whereas 
that for lists was highly significant (p <.001). 

It will be recalled that when the data for the 
N group alone were subjected to analysis of 
variance, the list effects were found to be 
highly significant (Table 2). The implication 
of the hypothesis is that list effects should be 
greater in the Sc than in the N group. A simple 
analysis of variance of the Sc group data, summarized 
in Table 2, shows no significant list effects (p > .05). 
Consequently, the hypothesis seems untenable. 
The Sc group did not show a greater than 
normal increase in difficulty with increasing 
intralist similarity; like the O group, it showed 
none. 


DISCUSSION 


Organics. The O group results confirm the 
hypothesis with respect to that group, thus 
supporting the theoretical framework from 
which it was derived. The pathologically 
lowered gradient of SG, which may be a strong 
factor in limiting the performance of organics 
on many intellectual tasks (e.g., those con- 
cerned with concept formation), is actually 
conceived here as an aid to performance. Fur- 
thermore, if the views presented here are 
correct, cortically damaged Ss should have a 
similar advantage on any complex task in 
which, because of SG, interference from incor- 
rect competing responses is a major deterrent 
to successful performance. It is not suggested 
that they will perform better than normals, for 
SG is obviously not the only process altered 
in cortical injury. It may be hypothesized, 
however, that performance differences between 
normal and cortically damaged Ss on a given 
task will become less as the opportunity is 
increased for SG-produced interference to 
occur. 

It has been shown that the empirical gra- 
dient of SG is positively related to such factors 
as the habit strength of the original S-R 
association and to the drive level currently 
operative (9). Mednick (14) had the impres- 
sion that his organic group was generally low 
in drive. On the assumption that amount of 
SG exhibited varies directly with drive level, 
he suggested that the lowered gradient of SG 
in cortically damaged Ss may be due to a 
pathologically diminished level of drive. In the 
present study, E had no such impression con- 
cerning the drive level of the O group. More- 
over, the analysis of the equated groups 
challenges a drive interpretation. Os and Ns 
who were approximately equal in general 
level of performance were selected for study. 
The findings could issue from differences be- 
tween the N and O groups in learning, in 
drive, or both. If either of these factors were 
responsible, however, equating groups, as was 
done, would be expected to result in the 
disappearance of the effect in question. No 
such result was found. 

In the absence of more reliable evidence, 
observations made here favor the hypothesis 
that the cortically injured individual is defi- 
cient in some kind of basic generalizing ability. 
This is, of course, essentially in keeping with 
Goldstein’s original notions about the effects 
of brain injury, although the issues are prob- 
ably more complex than he conceived them to 
be. In any case, the results suggest that the 
concept of SG may offer an explanatory prin- 
ciple of wider generality than Goldstein’s 
abstract-concrete dichotomy. 

As noted above, the statistical analysis 
indicated, unexpectedly, that the difference 
in difficulty between the practice and experi- 
mental lists was greater for the O than for the 
N group. Undoubtedly, the experimental lists, 
because of their length and composition, are 
on the whole more productive of interference 
than the practice list. But the results indicate 
that cortically damaged Ss are less susceptible 
to such interference than are normal Ss; ac- 
cordingly, an interference hypothesis does not 
afford an adequate explanation. The explana- 
tion probably lies in the common observa- 
tion that brain-damaged individuals often 
show a decrement in performance with sus- 
tained effort. Such an effect was probably 
operative in the O group as they moved from 
the practice through the experimental lists, 
thus accounting for their disproportionately 
poorer performance on the latter. 

Schizophrenics. It was hypothesized that 
with increasing intralist similarity, the Sc 
group would show progressively poorer per- 
formance than the N group. Actually, the 
performance of the Sc group did not vary 
significantly with intralist similarity. The 
results obtained for the Sc group are more in 
accord with the views held by Vigotsky (20), 
Goldstein (7), and Kasanin and Hanfmann 
(11), according to which the primary psycho- 
logical disturbance in schizophrenia is a con- 
cretization of mental processes or, in other 
words, a deficiency in the ability to abstract 
and conceptualize. These investigators believe 
that the schizophrenic, like the organic, is more 
or less incapable of appreciating similarity 
among different stimuli. The present results 
are consistent with such an hypothesis. How- 
ever, the findings on SG in schizophrenia 
engender a dilemma: if the schizophrenic is 
indeed unable to appreciate similarity among 
different stimuli, he should exhibit less SG 
than the normal S; two independent investigations 
(4, 14) indicate that quite the opposite is 
true. Furthermore, the character of the obser- 
vations which led to the application of the 
concretization hypothesis to schizophrenia is 
highly questionable, having been almost exclu- 
sively of a qualitative and uncontrolled type. 
Recent investigations of schizophrenic concep- 
tual performance, in which more rigorous and 
sophisticated experimental procedures were 
employed, have failed to confirm the concreti- 
zation notion (3, 12, 13). One is therefore 
extremely reluctant to conclude that the 
results reflect a concretization of mental 
processes or a lack of generalizing ability in 
the Sc group. 

Nor do the findings seem accountable on the 
basis of any systematic knowledge concerning 
the gradient of SG. The high performance 
level of the Sc group all but rules out the 
possibility of a decrement either in habit 
strength or in drive. There remains only the 
rather uncomfortable position of having to 
suggest that the findings may be due to the 
influence of some uncontrolled artifact oper- 
ating in connection with the Sc group. One 
possibility which comes immediately to mind 
is that of an uncontrolled medication effect. 
A considerable majority of the patients in the 
Sc group were under the influence of chlorpro- 
mazine at the time of testing. It is not incon- 
ceivable that such medication may depress 
the gradient of SG, and this would account 
for the ineffectiveness of the intralist similarity 
variable in the performance of the Sc group. 

It has been noted previously that the general 
level of performance of the Sc group on the 
experimental lists was not significantly infe- 
rior to that of the N group. This finding runs 
counter to much experimental data showing 
schizophrenics almost without exception to be 
inferior to normals on intellectual tasks (10, 
16). The explanation may lie, at least in part, 
in the fact that the Sc group was somewhat 
more intelligent and better educated than the 
N group. These differences may be a function 
of variation in the patient populations of the 
hospitals at which the Ss were obtained. 

On the practice list, the Scs conformed to 
expectations in performing at a significantly 
lower level than the Ns. This raises the question 
of why there should be a difference between 
the two groups on the practice list but no 
significant difference between them on the 
experimental lists. Two explanations seem 
plausible: (a) the practice list was more difficult 
for the Sc group than for the N group 
due, as some evidence (17) would suggest, to 
the former’s slowness in adjusting to a novel 
situation; or (b) assuming a genuine superiority 
in the N group, the experimental lists were 
more difficult for the N than for the Sc group 
due to the N group’s greater susceptibility to 
interference. There seems to be no reasonable 
basis for deciding between these alternative 
explanations. 

Finally, a word is perhaps in order con- 
cerning the possibility of relevant but 
uncontrolled variables being operative in con- 
nection with the composition of the experi- 
mental and control groups. The most con- 
spicuous problem is the discrepancy in mean 
age between the N and O groups, the O group 
being on the whole considerably older. While 
it is probable that the most significant changes 
accompanying advanced age are in the direc- 
tion of increasing organic deterioration, there 
is the very remote possibility that the O 
group results are more a function of age than 
of organic pathology per se. Some further 
difficulties of a similar nature arise in connection 
with differences among groups in intelligence, 
education, and length of hospitalization. 
In general, Scs and Os were long-term, 
chronically hospitalized cases, whereas Ns 
tended to be acute cases only recently hospi- 
talized. Again here, however, the possibility 
of influence from these factors on the main 
findings is considered remote. 


SUMMARY 


Experimental work growing out of earlier 
notions about conceptual ability in schizophrenic 
and brain-injured individuals has 
shown distortions of the gradient of stimulus 
generalization in these two pathological groups. 
Stated simply, it has been shown that schizo- 
phrenics exhibit more stimulus generalization 
than normals, while persons with cortical 
brain damage exhibit less stimulus generaliza- 
tion than normals. These findings may help 
to explain certain anomalies in the intellectual 
performance of schizophrenic and brain-injured 
individuals. 

The findings on stimulus generalization in 
these pathological conditions, in view of cur- 
rent theoretical interpretations of the role of 
stimulus generalization in producing the relationship 
between intralist similarity and difficulty 
of verbal rote learning, led to the 
adoption of the following hypotheses. With 
increasing intralist similarity in the material 
learned, (a) schizophrenics exhibit a more 
pronounced rate of increase in difficulty than 
a normal control group, and (b) cortically 
damaged organics exhibit a less pronounced 
rate of increase in difficulty than a normal 
control group. 

Three groups of hospitalized patients, diagnosed 
respectively as schizophrenic, cortically 
injured, and neuropsychiatrically normal, were 
each divided into three equated subgroups. 
Each subgroup within a main group learned 
one of three serial adjective lists which differed 
from each other in degree of intralist similarity 
of meaning. Thus, there were nine subgroups, 
each representing a different combination of 
neuropsychiatric condition and degree of intra- 
list similarity in the experimental list learned. 

Analysis of results supported Hypothesis 
(b). Hypothesis (a) was unconfirmed; the 
results for the schizophrenic group were 
opposite to those predicted by the hypothesis. 
Like the organics, they showed a less than 
normal increase in difficulty with increasing 
intralist similarity. However, various considerations 
prevent the drawing of any conclusions from this finding. 


THE INTELLIGENCE TEST PERFORMANCE OF MAORI CHILDREN: 
A CROSS-CULTURAL STUDY 


RICHARD H. WALTERS¹ 
University of Toronto 


Educators have come to rely heavily on 
the measurement of ability by psycho- 
logical tests for the prediction of aca- 
demic success. In a community like New 
Zealand where the Maori form a considerable 
minority group of varying degrees of accultura- 
tion and where, for the most part, Poly- 
nesians share the same educational oppor- 
tunities and advantages as Europeans, the 
routine testing of mixed groups may bring 
about misleading results. The investigation 
reported in this paper was carried out to assess 
the performance of Maori pupils on a variety 
of tests, verbal and nonverbal, of a kind cus- 
tomarily used for the assessment of intelli- 
gence among Euro-American populations. 


THE TESTING PROGRAM 


During the winter months of 1953, testing 
was carried out in Northern New Zealand 
with schoolchildren between the ages of 11.0 
and 15.11. Two test batteries were used: (a) 
the Science Research Associates’ shortened 
form of Thurstone’s Tests of Primary Mental 
Abilities (PMA); and (b) a “Non-Language 
Test” (NLT) comprised of items from existing 
test batteries with modifications designed to 
reduce the effects of differing cultural ex- 
periences. 

Maori testees were taken from three general 
sources: (a) from schools in the vicinity of 
Auckland under the supervision of the local 
education authority, with the addition of a 
small number of cases from the town of 
Whangerei (city group); (b) from semi-rural 
areas bordering on Auckland in which the 
Maori works as a laborer in market gardens 
and lives under extremely adverse social con- 
ditions (market-garden group); (c) from 
country schools under the supervision of the 
Department of Maori Schools (country 
group). Within the country group was in- 
cluded a small proportion of pupils from 
boarding schools in and near Auckland, since 
by far the greater number of these Maori pupils 
come to the boarding schools around the age of 
13 or 14 from Maori primary schools in pre- 
dominantly Maori areas and so are lost to the 
Maori District High Schools. 

1 This research was carried out while the author was 
on the faculty of Auckland University College, New 
Zealand. 

The control group consisted of children of 
European descent taken almost exclusively 
from the public schools of Auckland, the few 
exceptions being pupils of European descent 
who happened to be attending Maori schools 
for geographical reasons and a few pupils from 
Whangerei High School who were tested at the 
same time as the Whangerei Maori group. The 
selection of a group of pupils primarily from 
city schools to serve as a control group may 
have put the Maoris at a disadvantage, since 
there is a tendency for city pupils in New 
Zealand to score higher on intelligence tests 
than do country pupils (2). The selection of 
such a group was, however, necessary on ac- 
count of limitations of time and finance. 

The cooperation of school principals was 
enlisted for the administration of the tests. In 
addition, principals or classroom teachers were 
asked to rate each Maori pupil on each of five 
three-point scales designed to give a rough 
measure of variables that might be related to 
test scores: school attendance, continuity of 
education, conversational verbal fluency, de- 
gree of acculturation, and educational progress. 

A total of over 900 Maori and 600 control 
children were tested. There is some doubt con- 
cerning the representativeness of the sample 
of 15-year-olds, since in New Zealand a pupil 
may leave school as soon as he has attained his 
fifteenth birthday. Throughout the 11-, 12-, 
13-, and 14-year-old groups, however, ade- 
quate samples were obtained both of Maori 
and of control children, and results from one age 
group to another were remarkably consistent. 
Consequently, this paper is based primarily on 
the results for the 13-year-old group only. 


Description of the Tests 


As is well known, the PMA test consists of 
five subtests: Verbal Meaning, Space, Reasoning, 
Number, and Word Fluency, corresponding 
to Thurstone’s Factors V, S, R, N, and W. The 
PMA test was administered in a standard 
manner. However, because of the difficulty 
some Maori children appeared to find in the 
use of the answer pad, the form of test was 
modified to allow answers to be written beside 
the test item instead of being recorded on an 
answer pad. This modified form was used with 
pupils who did not reside in urban areas, but 
the standard answer pad form was used with 
pupils in the city schools who are more ac- 
customed to testing procedures. An unpub- 
lished study by Leone Smith of Auckland Uni- 
versity College has shown that the use of the 
modified PMA test booklet does not signifi- 
cantly change scores of Maori pupils, though 
it does significantly raise the scores of children 
of European descent. Since almost all the 
control group was tested with the standard 
answer pad form, it is unlikely that results 
have been seriously affected. The use of the 
modified form in country and semi-rural areas 
undoubtedly saved many spoiled test records. 

The non-language battery consisted of six 
subtests: a Substitutions test which was es- 
sentially the Digit Symbol Test of the Wechsler 
Intelligence Scale for Children, Picture Com- 
pletion and Picture Arrangement tests based 
primarily on the Wechsler-Bellevue and WISC 
items, a Block Counting test consisting of items 
taken from the Army General Classification 
Test, a Differences test taken from the SRA 
Non-Verbal Intelligence Test, and a Series 
test based on items from the Differential 
Aptitude Tests. All subtests were set up so 
that they could be administered as a group in- 
telligence test with time limits which were set 
after some preliminary experimentation.* 


The Rating Scales 


The five rating scales were each defined by 
three points in such a way that a school prin- 
cipal or classroom teacher might be able very 
quickly to give a reasonably accurate rating on 
the variables under consideration. 


* Copies of the non-language test and of instructions 
given to administrators have been lodged with the 
American Documentation Institute. Order Document 
No. 5558 from ADI Auxiliary Publications Project, 
Photoduplication Service, Library of Congress, Wash- 
ington 25, D. C., remitting in advance $2.00 for micro- 
film or $3.75 for photocopies. Make checks payable to 
Chief, Photoduplication Service, Library of Congress. 

1. School Attendance: 

Rating 1. Regular attendance. 

Rating 2. Occasional absences; not excessive, but 
sufficient to impose some handicap. 

Rating 3. Attendance very irregular. 

2. Continuity of Education: 

Rating 1. Attendance at one school only during 
school career. Rate 1 also if the only change has been 
from primary to postprimary school. 

Rating 2. Attendance at two, but not more, 
schools either at primary or postprimary level, or at 
both. 

Rating 3. Attendance of three or more schools 
either at primary or at postprimary stage. 

3. Verbal Fluency (Conversational): 

Rating 1. Very fluent; good self-expression. 

Rating 2. Has reasonable command and under- 
standing of English. 

Rating 3. Very little command of English. 

4. Acculturation: 

Rating 1. Home conditions approximating to 
average European home. Rate 1 whether or not Maori 
is spoken at home provided good English is also spoken 
by parents. 

Rating 2. A good home with reasonable amount of 
European cultural influence. A good deal of Maori 
spoken; English of parents of mediocre quality. 

Rating 3. A poor home with little or no European 
influence. Very little, if any, English spoken. If condi- 
tions are extremely poor, rate 3 even if parents speak 
reasonably good English. 

5. Educational Progress: 

Rating 1. Better than average. 

Rating 2. Average. 

Rating 3. Below average. 


The use of broadly defined categories includ- 
ing evaluative terms was decided upon after 
consultation with teachers who had had ex- 
perience of teaching in Maori schools. It was 
felt that such scales could be more easily, and 
would be more readily, used by teachers than 
ones which were more detailed and more ob- 
jectively based. Since the scales were used by a 
considerable number of teachers, and the 
number of pupils who were rated is very large, 
an adequate rough grouping of pupils on the 
five variables has almost certainly been 
achieved. 


RESULTS 


TABLE 1 
INTELLIGENCE TEST SCORES OF 13-YEAR-OLD MAORI AND CONTROL GROUPS 

[Table body not recoverable from the source scan. The table gave the 
mean and SD of each subtest and of the total score for the City Maori 
(N = 48), Market-Garden Maori (N = 36), Country Maori (N = 131), and 
Control (N = 116) groups, beginning with the Primary Mental Abilities.] 


Table 1 gives the mean scores of the control 
and the three Maori groups on the subtests, 
and also the total test result, for both batteries 
(13-year-old group only). A test of homogeneity 
of variance, using Bartlett’s procedure, was 
carried out for the three Maori groups; the 
control group was then added and a further 
test of homogeneity was made. Whether or 
not heterogeneity was found, an analysis of 
variance was made both for the three Maori 
groups alone and for the control and three 
Maori groups together. Wherever the value of 
F was significant, tests of the significance of 
differences between mean scores were carried 
out, using the within variance as the best 
estimate of population variance when the 
variance of the samples was homogeneous and 
the critical ratio formula in cases where 
heterogeneity of variance had been found. Since 
the primary concern is to compare the Maori 
groups with each other, and to compare each 
of them individually with the control group, 
only the results of the t tests are reported here 
(Tables 2 and 3). 
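
The sequence described above (Bartlett's test first, then a t ratio based on the within-groups variance when the variances are homogeneous, or a critical ratio when they are not) might be sketched as follows. This is only an illustration, not the author's computation: the function names and the sample scores are hypothetical, and the critical-ratio formula shown, in which each group's own variance enters the standard error, is an assumption about the formula the paper refers to.

```python
import math
from statistics import mean, variance


def bartlett_statistic(groups):
    """Return (Bartlett chi-square, pooled within-groups variance); df = k - 1."""
    k = len(groups)
    sizes = [len(g) for g in groups]
    variances = [variance(g) for g in groups]  # unbiased; must all be > 0
    df_within = sum(sizes) - k
    pooled = sum((n - 1) * v for n, v in zip(sizes, variances)) / df_within
    log_term = df_within * math.log(pooled) - sum(
        (n - 1) * math.log(v) for n, v in zip(sizes, variances)
    )
    correction = 1 + (sum(1 / (n - 1) for n in sizes) - 1 / df_within) / (3 * (k - 1))
    return log_term / correction, pooled


def pooled_t(a, b, within_variance):
    """t ratio using the within-groups variance as the population estimate."""
    se = math.sqrt(within_variance * (1 / len(a) + 1 / len(b)))
    return (mean(a) - mean(b)) / se


def critical_ratio(a, b):
    """Critical ratio: each group's own variance enters the standard error."""
    se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / se


def compare(a, b, groups, critical_value):
    """Pick the test for groups a and b from the outcome of Bartlett's test."""
    stat, pooled = bartlett_statistic(groups)
    if stat < critical_value:  # variances treated as homogeneous
        return "t", pooled_t(a, b, pooled)
    return "critical ratio", critical_ratio(a, b)


# Hypothetical subtest scores for three groups (not data from the study).
country = [12, 15, 11, 14, 13]
city = [10, 12, 9, 13, 11]
market_garden = [7, 9, 8, 6, 10]
groups = [country, city, market_garden]

# 5.991 is the .05 critical value of chi-square for df = 2 (three groups).
print(compare(country, market_garden, groups, critical_value=5.991))
```

With three groups Bartlett's statistic has two degrees of freedom; adding the control group as a fourth raises this to three, and the critical value passed to `compare` would change accordingly.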

The results for the 13-year-old group are 
fairly typical of what was found at each age 
level. On the PMA, the country Maori does 
almost as well as the control, as far as IQ is 
concerned. On the other hand, the control 
group does significantly better than the city 
or the market-garden Maori, which appears to 
be definitely an intellectually inferior group. 

The pattern of scores within the PMA is of 
especial interest. In general, the control group 
tends to do better than all three of the Maori 
groups on V, R, and S; on the other hand, the 
country Maori performs decidedly better than 
the control group on N, and does equally well 
on W. In fact, the city and country Maori 
groups showed a greater increase in W over 
the ages studied than did the control group, 
so that, after gaining a somewhat lower score 
than the control group on this factor at the 
11-year level, by the 15-year level they obtain 
a score which is significantly higher than that 
of the control group. 

The picture is quite different for the NLT on 
which the Maori might have been expected to 
do relatively better than on the PMA. At all 
age levels, the control group gained a signifi- 
cantly higher total score than any of the Maori 
groups and showed definite superiority on all 
subtests except Substitutions. The differ- 
ences between the country and city Maori 
largely disappear; the groups do about equally 
poorly compared to the controls. The market- 
garden group, however, still remains a much 
inferior group in test performance. 

TABLE 2 
SIGNIFICANCE OF MEAN GROUP DIFFERENCES BETWEEN INTELLIGENCE TEST SCORES 
OF COUNTRY, CITY, AND MARKET-GARDEN MAORI GROUPS 

Primary Mental Abilities (differences between means) 

                  Country vs.    Country vs.      City vs. 
                  City           Market-Garden    Market-Garden 
Subtest                          (df = 165)       (df = 82) 
Verbal Meaning    …              7.27             5.46 
Space             …              …                … 
Reasoning         1.19           1.86             0.67 
Number            4.61           7.05             2.44 
Word Fluency      0.64           6.91             6.27 
IQ                4.60           …                14.03 

Non-Language Test (differences between means) 

                  Country vs.    Country vs.      City vs. 
                  City           Market-Garden    Market-Garden 
Subtest           (df = 179)     (df = 167)       (df = 84) 
Pict. Comp.       -.01           …                1.52 
Pict. Arr.        -.28           …                1.84 
Total             -1.23          …                9.19 

[Entries shown as … and all t ratios for the Primary Mental Abilities 
are illegible in the source scan. Three t ratios for the Non-Language 
Test are legible (3.73, 2.28*, 3.42**) but cannot be assigned to a column.] 

Note. In the preliminary analyses of variance, the F ratios 
for Substitutions, Block Counting, Differences, and Series were 
not significant; consequently, t ratios were not calculated. 
* Significant at the .05 level. 
** Significant at the .01 level. 


TABLE 3 
SIGNIFICANCE OF MEAN GROUP DIFFERENCES BETWEEN INTELLIGENCE TEST SCORES 
OF CONTROL AND MAORI GROUPS 

Primary Mental Abilities 

                  Control vs.         Control vs.         Control vs. 
                  Country Maori       City Maori          Market-Garden Maori 
                  (df = 245)          (df = 162)          (df = 150) 
Subtest           Diff.    t          Diff.    t          Diff.    t 
Word Fluency      ….50     0.99       -0.86    0.44       …        … 
IQ                ….44     0.72       6.04     2.46*      …        … 

[The entries for Verbal Meaning, Space, Reasoning, and Number are not 
legible enough in the source scan to be assigned to cells.] 

Non-Language Test 

                  Control vs.         Control vs.         Control vs. 
                  Country Maori       City Maori          Market-Garden Maori 
                  (df = 256)          (df = 173)          (df = 161) 
Subtest           Diff.    t          Diff.    t          Diff.    t 
Pict. Comp.       …        …          1.20     2.63**     2.72     5.38** 
Pict. Arr.        3.07     6.77**     2.79     4.56**     4.63     6.81** 
Block Counting    1.95     4.81**     2.01     3.66**     3.26     5.35** 
Differences       1.31     3.69**     1.32     2.60**     2.60     3.57** 
Series            2.76     5.46**     1.85     2.70**     3.51     4.61** 
Total             10.37    6.38**     9.14     4.24**     18.33    7.65** 

Note. In the preliminary analysis of variance, the F ratio for 
Substitutions was not significant; consequently, t ratios were not 
calculated. 
* Significant at the .05 level. 
** Significant at the .01 level. 


The tests of homogeneity of variance pro- 
vided important additional data. On the PMA, 
there was no case for the 11- through 14-year- 
old groups in which the addition of the control 
group resulted in heterogeneity when the 
variances of the Maori group had been 
homogeneous. With few exceptions, similar 
findings were obtained with the NLT. 

Analysis of ratings was carried out for 697 
Maori children of the 11 through 14 age 
groups for whom scores on both tests and a 
full set of ratings were obtained. For every 
subtest and for the total PMA and NLT 
scores, the pupils of each age group were di- 
vided into two groups as nearly as possible 
equal, according to whether their scores fell 
above or below the approximate median score 
of the distribution. The distributions of teachers’ 
ratings among the high and low scoring groups 
were then compared by means of the chi- 
square test. The relationships of all five rated 
variables and each of the total scores, and 
also those between each of the subtest scores 
and the ratings of acculturation and educa- 
tional progress, were assessed in this manner. 
The findings are given in Tables 4, 5, and 6. 
This method was employed because the chi- 
square values might be summated over the 
age groups to give an approximate indication 
of the degree of association of test scores 
and ratings over the total group of Ss. Actually, 
in the case of the subtests in particular, fluc- 
tuations in the value of chi square from one 
age group to another were so considerable that 
a comparison of total chi-square values would 
be misleading. 
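
The median-split comparison described above can be illustrated with a short sketch. It assumes three rating categories and two score groups (hence df = 2, the value given in the notes to Tables 4, 5, and 6); the function names and the counts in the usage example are hypothetical and are not data from the study. For df = 2 the chi-square tail probability has the closed form exp(-chi-square / 2), which the sketch uses in place of a table lookup.

```python
import math
from statistics import median


def median_split_table(scores, ratings, levels=(1, 2, 3)):
    """Cross-tabulate above/below-median score groups against rating levels."""
    cut = median(scores)
    high = [0] * len(levels)
    low = [0] * len(levels)
    for score, rating in zip(scores, ratings):
        row = high if score > cut else low  # ties at the median count as "low"
        row[levels.index(rating)] += 1
    return [high, low]


def chi_square(table):
    """Pearson chi-square for a contingency table (no empty rows or columns)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    return stat


def p_value_df2(stat):
    """Upper-tail probability of chi-square with exactly 2 degrees of freedom."""
    return math.exp(-stat / 2)


# Hypothetical counts: rows = high and low scorers, columns = ratings 1, 2, 3.
table = [[10, 20, 30], [30, 20, 10]]
stat = chi_square(table)
print(round(stat, 3), p_value_df2(stat) < 0.01)
```

The same closed form recovers the familiar critical values: a chi-square of 5.991 corresponds to p of about .05 and 9.210 to p of about .01 when df = 2.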

Two of the subtests only, V and W, showed 
a consistently significant relationship with the 
acculturation variable over all four age groups; 
only the Series tests showed a complete ab- 
sence of any significant relationship. Although 
in general the subtests of the NLT tended to 
give fewer significant chi squares, the total 
score obtained by a simple summation of the 
scores of the subtests was just as highly re- 
lated to the acculturation ratings as was the 
total PMA score. 

TABLE 4 
RELATIONSHIPS BETWEEN HIGH AND LOW TOTAL SCORES ON THE PRIMARY MENTAL 
ABILITIES AND NON-LANGUAGE TESTS AND TEACHERS’ RATINGS OF MAORI CHILDREN(a) 
(Median test) 

Total Primary Mental Abilities Scores vs. Ratings by Age Group 

                          11.0-11.11   12.0-12.11   13.0-13.11   14.0-14.11 
Rated variable            (N = 153)    (N = 160)    (N = 191)    (N = 193) 
School attendance         13.422**     17.031**     5.094        12.953** 
Continuity of education   5.204        2.912        7.275*       3.254 
Verbal fluency            8.095*       15.108**     23.741**     16.775** 
Acculturation             11.461**     16.686**     11.452**     22.425** 
School progress           30.451**     16.326**     16.670**     27.547** 

Total Non-Language Test Scores vs. Ratings by Age Group 

                          11.0-11.11   12.0-12.11   13.0-13.11   14.0-14.11 
Rated variable            (N = 153)    (N = 160)    (N = 191)    (N = 193) 
School attendance         15.721**     …            …            … 
Continuity of education   2.945        .583         .808         1.058 
Verbal fluency            10.211**     12.796**     9.707*       19.004** 
Acculturation             17.202**     10.584**     19.770**     21.921** 
School progress           25.900**     16.980**     27.680**     31.138** 

[Column placement of the School attendance and Continuity of education 
entries is uncertain in the source scan. Two further School attendance 
values for the Non-Language Test (5.735 and 5.021) are legible but 
cannot be assigned to a column.] 

* Significant at the .05 level. 
** Significant at the .01 level. 
(a) Chi-square values (df = 2 for each age group). 


TABLE 5 
RELATIONSHIPS BETWEEN HIGH AND LOW SUBTEST SCORES AND TEACHERS’ RATINGS 
OF ACCULTURATION OF MAORI CHILDREN(a) 
(Median test) 

Age-group columns: 11.0-11.11 (N = 153), 12.0-12.11 (N = 160), 
13.0-13.11 (N = 191), 14.0-14.11 (N = 193) 

Subtest 
Verbal Meaning     13.204**     14.760**     13.917**     10.743** 
Space              2.020        4.584        3.564        19.330** 
Reasoning          9.324**      1.539        14.648**     6.191* 
Number             8.348*       0.512        4.925        8.442* 
Word Fluency       6.929*       11.961**     11.106**     9.114* 
Substitutions      4.905        1.105        30.360**     2.745 
Pict. Comp.        3.294        0.638        11.682**     4.231 
Pict. Arr.         16.629**     4.896        1.860        10.475** 
Block Counting     5.965        0.942        14.010**     5.100 
Differences        1.326        3.551        8.503*       3.055 
Series             5.682        3.140        4.844        0.699 

[The four value columns are given in the order in which they appear in 
the source scan; their assignment to the age-group columns could not 
be verified.] 

* Significant at the .05 level. 
** Significant at the .01 level. 
(a) Chi-square values (df = 2 for each age group). 


TABLE 6 
RELATIONSHIPS BETWEEN HIGH AND LOW SUBTEST SCORES AND TEACHERS’ RATINGS 
OF EDUCATIONAL PROGRESS OF MAORI CHILDREN(a) 

                   11.0-11.11   12.0-12.11   13.0-13.11   14.0-14.11 
Subtest            (N = 153)    (N = 160)    (N = 191)    (N = 193) 
Verbal Meaning     ….142**      29.698**     22.957**     16.352** 
Space              ….709        8.452*       5.902        5.382 
Reasoning          18.313**     9.991**      14.453**     21.164** 
Number             ….560**      6.004*       11.258**     18.458** 
Word Fluency       ….399**      18.350**     5.120        14.528** 
Substitutions      ….772**      9.600**      2.144        26.709** 
Pict. Comp.        ….596*       7.428*       8.100*       9.340** 
Pict. Arr.         ….893*       3.637        13.877**     5.213 
Block Counting     ….246*       8.341*       5.775        12.187** 
Differences        …            8.169*       3.299        9.867** 
Series             2.085        2.881        12.139**     12.252** 

[Leading digits of several first-column entries are illegible in the 
source scan and are shown as ….] 

* Significant at the .05 level. 
** Significant at the .01 level. 
(a) Chi-square values (df = 2 for each age group). 


Two of the PMA subtests, V and R, were 
significantly related to educational progress 
at least at the .01 level for each age group; N 
and Picture Completion were consistently re- 
lated to this variable at least at the .05 level. 
Although the PMA subtests appear indi- 
vidually to be the better predictors of educa- 
tional progress, the total scores on the PMA 
and NLT show a very similar and only mod- 
erate degree of association with the progress 
ratings. (Contingency coefficients derived from 
the total chi squares are .34 and .36 for the 
PMA and NLT, respectively.) 
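
The two contingency coefficients quoted above can be checked against the School progress rows of Table 4. The sketch below assumes that they are Pearson's coefficient, C = sqrt(chi-square / (chi-square + N)), computed from the chi-square values summed over the four age groups with N = 697; the paper does not say which coefficient was used, so the formula is an assumption, and the variable and function names are illustrative only.

```python
import math

# Age-group sample sizes from the table headings (153 + 160 + 191 + 193 = 697).
N_TOTAL = 153 + 160 + 191 + 193

# School progress chi-square values as read from Table 4, one per age group.
PROGRESS_CHI_SQUARES = {
    "PMA": [30.451, 16.326, 16.670, 27.547],
    "NLT": [25.900, 16.980, 27.680, 31.138],
}


def contingency_coefficient(chi_sq, n):
    """Pearson's contingency coefficient for a chi-square based on n cases."""
    return math.sqrt(chi_sq / (chi_sq + n))


for battery, values in PROGRESS_CHI_SQUARES.items():
    total = sum(values)  # summed chi-square; df = 2 per age group, 8 in all
    c = contingency_coefficient(total, N_TOTAL)
    print(f"{battery}: total chi-square = {total:.3f}, C = {c:.2f}")
```

Under that assumption the summed chi-squares are 90.994 and 101.698, and the coefficients round to .34 and .36, in agreement with the values reported in the text.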


Total scores on both tests were significantly 
related to school attendance and conversational 
verbal fluency. Continuity of education, how- 
ever, appears to be a less potent influence on 
test scores; the relationship between this 
variable and total NLT scores was, in fact, 
negligible. 

DISCUSSION 


It is evident that the Maori groups differ too 
greatly among themselves to be considered a 
single population for comparison purposes in a 
cross-cultural study of test performance. The 
significant differences between the means and 
between the variances of the three Maori 
groups are evidence to this effect. The findings 
constitute further evidence that cross-cultural 
studies of intelligence test performance cannot 
yield worthwhile conclusions unless socio- 
economic and educational differences within 
the different cultural groups are taken into 
account. They show, too, the futility of draw- 
ing general conclusions about the ability of a 
cultural group from studies of samples taken 
from a single locality, a practice which has been 
followed in some prior studies of Maori intelli- 
gence. In this connection, it should also be 
borne in mind that the control group in the 
present study was drawn largely from an urban 
area and so was probably slightly above the 
average for the New Zealand population. 

The superiority of the country Maori on the 
PMA may arise from the fact that pupils in 
the Maori schools are provided with an educa- 
tional program which is designed to meet their 
needs and to allow for handicaps such as poor 
command of English. The educational ad- 
vantage of being taught in a Maori school may 
more than counterbalance the more intense 
contact with European culture experienced by 
the city Maori. This is especially likely to be 
the case when the test in question is for the 
most part closely related to basic school skills. 

On the other hand, it is possible that a 
process of selective migration may be operat- 
ing in the opposite direction from that which 
has been presumed to occur among Europeans 
and Americans. For economic reasons, the 
Maori is sometimes forced to seek work in the 
city where the majority of Maori are unskilled 
laborers. Perhaps it is the more skillful and 
alert Maori, who is more ready to learn from 
experience, who succeeds in farming success- 
fully or holding a steady position of responsi- 
bility in his own community. 

In addition, Maori children in city areas 
frequently enjoy a less stable home and com- 
munity life than those living in country areas; 
there is thus more likelihood of maladjustment 
and emotional difficulties which may impede 
educational progress. 

Both of the two last-mentioned factors, 
selective migration and emotional instability, 
may contribute to the very poor test per- 
formance of the market-garden Maori. It is in 
the areas in which he lives that the Maori 
meets with the maximum of discrimination; 
conditions of housing are poor and standards of 
living very depressed. The work is largely 
seasonal and is likely to attract mainly those 
Maori who cannot find better-paying and more 
stable employment elsewhere. 

No study of selective migration has, to the 
writer’s knowledge, been carried out in New 
Zealand. The selective migration hypothesis, as 
it has been applied to explain differences in test 
performance between northern and southern 
Negroes, has been severely criticized by Kline- 
berg (1). In view of the fact that the findings 
about the Maori are the reverse of those 
generally found for colored subjects in the 
United States, New Zealand would seem to be 
a good ground for testing out once more the 
selective migration hypothesis by experi- 
mental investigations. A large number of 
Maori families return to the country after a 
period of life in the city; many others move 
into the city from country areas. What, we 
might ask, happens to the intelligence test per- 
formance of Maori children when the families 
have made such a move? For the present, how- 
ever, it is probably safer to look for an ex- 
planation of the better performance of the 
country children in terms of educational, social, 
and emotional adjustment. 

The interpretation of scores for separate 
factors of the PMA must be guarded in view 
of the fact that we are dealing with a group 
that is culturally different from the group on 
which the factors were isolated. Assuming, 
however, that the test measures approximately 
the same set of abilities for Maori and control 
alike, we may say that while the control group 
tends to be consistently superior in respect to 
the Factors V, S, and R, scores for the Factors 
N and W sometimes favor the Maori, particu- 
larly the country group. The finding for N is 
surprising; while the high scores of the country 
Maori could reflect to some degree the em- 
phasis placed on basic procedures in Maori 
schools, the fact that the city group was not 
significantly inferior to the control group at 
any age level suggests that this is an ability in 
which the Maori is potentially strong. The 
comparatively low Maori score for V and 
comparatively high score for W seem to bear 
out a frequently made observation that Maori 
students tend to be verbally productive and 
fluent but lack preciseness of expression. The 
discrepancy undoubtedly reflects the same 
kind of difficulty that can be seen in the 
foreigner who has acquired a considerable 
English vocabulary, but whom the nuances of 
the language still escape. 

The pattern of Maori strengths and weak- 
nesses on the PMA may also afford a clue to 
the fact that Maori children tend to make 
relatively poor school progress. They do com- 
paratively poorly on both V and R; yet these 
appear to be more highly related to school 
progress than any other subtests, both accord- 
ing to Thurstone and Thurstone (3) and on the 
evidence of the present study. Therefore, al- 
though the mean IQ for the country Maori 
group is generally above 100 and the mean IQ 
for the city Maori is generally little below, the 
subtest patterning suggests that the Maori is 
likely to perform in school at a level below that 
which would be expected from his total IQ on 
the PMA test. The existence of such predictive 
patterns on other tests and among other groups 
of Ss may perhaps have received too little at- 
tention; psychologists are so concerned with 
detecting potential for better performance 
than would be indicated by total test scores 
that they rarely concern themselves with pat- 
terns which might lead one to predict poorer 
school performance than would be expected 
from the total score or IQ alone. 

The results throw a great deal of doubt on 
the general efficacy of nonverbal tests for the 
assessment of the intelligence of children who 
are culturally handicapped. On the whole, re- 
sults on the NLT proved more unfavorable to 
the Maori, in comparison with the controls, 
than did those on the PMA. In addition, on 
the one PMA test which calls for the handling 
of perceptual relationships, i.e., S, they also 
did relatively poorly. The tendency for the 
more strictly verbal tests to be individually 
more highly related to acculturation than the 
remaining tests may be in part explained by 
the fact that acculturation was defined partly 
in terms of parental command of the English 
language. Thus, there may be other cultural 
differences operating which partly account for 
the generally poor showing of the Maori on the 
perceptual tests. In other words, the fact that 
the verbal tests are within the Maori group 
more highly related to the acculturation ratings 
may be due solely to the way in which ac- 
culturation was defined by the scale. 

If a test is to be used as a predictor of im- 
mediate educational progress, obviously even 
for the Maori children the traditional type of 
verbal test seems to be the more efficient; in 
fact, V appears to be about as good a predictor 
by itself as either of the test batteries as a 
whole. This is, of course, not surprising, for 
educational progress is obviously highly de- 
pendent upon command of the language of 
instruction, which in all New Zealand schools 
is English. 

The problem of getting at “intellectual po- 
tential” in the case of these Maori children 
must be considered in the light of their edu- 
cational background. Compulsory education 
starts at seven years of age in the Maori 
schools; consequently, all of our Ss had had at 
least four years of schooling with English as 
the language of instruction. One would expect 
the more alert and intelligent child to gain 
more from exposure to educational influences 
than the less intelligent one. Thus, although 
on account of bilingualism and inaccuracy in 
the use of English in their everyday environ- 
ment they may not have had the opportunity 
to acquire precision in the use of language, 
their ability to execute a simple operation such 
as the addition of numbers with speed and 
accuracy and the size and availability of their 
vocabulary, as opposed to accuracy of usage, 
may provide as good indications of “‘potential” 
as their performance on perceptual tasks. 
Many of the culturally handicapped children 
who are examined in other countries are in the 
same situation as the Maori in New Zealand; 
they have been exposed for some time to edu- 
cational influences within the dominant cul- 
ture. Judging from the results of this study, 
tests such as N and W may be useful for as- 
sessing potential under these circumstances. 
They are probably also more useful predictors 
of immediate academic success than most per- 
ceptual tests; however, as we have pointed 
out above, they may lead to too high an ex- 
pectation if tests such as V and R do not pro- 
duce corresponding results. 

It should, finally, be emphasized that the 
varying performance of Maori children may 
not be accountable solely in terms of cultural 
factors. On the PMA, the country group was 
generally somewhat superior to the city group. 
Yet, when a comparison of the distributions of 
ratings of acculturation was made for these 
two groups, the city group proved to be sig- 
nificantly more acculturated (chi square = 
11.889; p < .01). Since most of the country 
group was taken from isolated communities, 
this is what we should expect. Moreover, the 
tests on which the Maori children were rela- 
tively the most successful, N, W, and Substitu- 
tions, appear to be more highly related to the 
acculturation variable than do others on 
which they do less well. Even allowing for the 
fact that our definition of acculturation was 
probably somewhat inadequate, the findings 
therefore suggest that we may be dealing with 
real strengths and weaknesses, which are 
somewhat independent of cultural factors as 
here defined, though they may perhaps be 
partly the outcome of exposure to particular 
educational methods. 


SUMMARY 


The SRA form of Thurstone’s Test of Pri- 
mary Mental Abilities and a specially com- 
piled nonverbal test battery were admin- 
istered to Maori children aged 11 through 15 
and to a control group of New Zealand chil- 
dren of European origin. The Maori children 
were taken from three areas: (a) the city of 
Auckland and the town of Whangerei; (b) 
semi-rural, market-garden areas, bordering 
on Auckland; and (c) outlying country areas. 
Wherever possible, teachers’ ratings were ob- 
tained for Maori pupils for five variables which 
were expected to show some relationship to 
test scores. 

The Maori groups differ so greatly among 
themselves that they cannot be considered a 
single population for purposes of comparison 
with the control group. The differences be- 
tween the Maori groups appear to reflect the 
influence of educational, socioeconomic, and 
adjustment factors, though a selective migra- 
tion process of a reverse kind from that al- 
leged to occur among European groups may 
also be operating. 

On the PMA, the Maori appear to be rela- 
tively strong on the Factors N and W, and 
relatively weak on the Factors V, S, and R. 
This patterning of the subtests may give an 
indication of why Maori children tend in gen- 
eral to do less well in school than would be ex- 
pected on the basis of Maori ability as meas- 
ured by the total test score. 

Generally speaking, the Maori groups did 
less well in comparison with the control on the 
totally nonverbal test than they did on the 
PMA. This was consistent with their poor 
showing on Thurstone’s S Test, since the non- 
verbal battery relied heavily on the discovery 
or recognition of perceptual relationships. 
Some doubt is thrown on the efficacy of non- 
verbal tests for the assessment of ability of 
culturally handicapped groups. 

RICHARD H. WALTERS 

Total scores on the PMA and nonverbal bat- 
tery appear to be about equally related to 
teachers’ ratings of school attendance, conver- 
sational verbal fluency, and acculturation. The 
PMA, but not the nonverbal battery, was also 
related to continuity of education. 

The PMA and nonverbal battery were about 
equally related to school progress when only 
total scores were considered. However, strictly 
verbal tests are individually generally the best 
predictors; V alone appears to predict success 
about as well as either of the test batteries 
taken as a whole. Factors N and W show some 
relationship to educational progress and do not 
generally seem to put the Maori at a disad- 
vantage but may lead one to expect better 
progress than the Maori child is, in fact, likely 
to make. While they may not be good indi- 
cators of “potential” for this culturally handi- 
capped group, V and R are probably better 
indicators of more immediate school progress. 


CRITIQUE AND NOTES 


SOME SHORTCOMINGS IN PROJECTIVE TEST VALIDATION 
KENNETH PURCELL 
University of Kentucky 


Psychologists have sometimes taken the posi- 
tion that projective tests have done their work 
when they reveal the inner, private world of the 
subject. Occasionally, this point of view makes for 
a situation in which neither the test data nor the 
interpretations of these data can ever be explicitly 
and publicly evaluated. Imagine a situation with 
which most clinical psychologists doing diagnostic 
work are familiar: The psychologist makes certain 
statements about S’s characteristics on the basis 
of projective test data. If, over a period of time, S 
shows no sign of manifesting such characteristics 
either in daily life or in psychotherapy, the diag- 
nostician has to offer some explanation. He has 
several alternatives. He may accuse his test data 
of being unreliable and limited as a sample of 
behavior. He may re-examine the theories he used 
in interpretation. Or he may even say that his data 
and inferences refer to a deeper level of the uncon- 
scious than either S’s overt behavior or his produc- 
tions in therapy. This last assertion can, of course, 
never be proved wrong, and the psychologist may 
go on merrily describing one unconscious after 
another, thoroughly insulated from observable 
fact. 

This little bit of caricature highlights the desir- 
ability of a validity criterion which includes be- 
havioral prediction. As a matter of fact, however, 
a number of investigations have utilized such a 
criterion with inconclusive or negative results 
being at least as frequent as positive findings. It 
is the main thesis of this paper that some of the 
sources of invalidity in projective techniques are 
avoidable and related to inadequate conceptual 
schemes. 

One of the serious deficiencies in recent work is 
the tendency to ignore a significant aspect of the 
test data. Studies seeking to relate test perform- 
ance to overt behavior have commonly failed to 
consider signs of defensive, inhibitory tendencies. 
Yet one of the major concerns of clinical psychol- 
ogists is the study of conflict in human behavior. 
The kind of conflict that is usually of central 
importance in personality disturbance is that 
described by Dollard and Miller (2) as approach- 
avoidance conflict. Their analysis of approach- 
avoidance behavior in therapy may also be applied 
to what occurs in the projective test situation. 
Suppose an S has strong aggressive tendencies 
(approach) which are inhibited by associated 
anxiety (avoidance). In testing, as in therapy, an 
attempt is made to reduce the strength of avoid- 
ance, thus permitting the expression of ordinarily 
forbidden impulses. The avoidance gradient is 
lowered in two main ways: (a) by the permissive, 
accepting dcmeanor of the tester, and (0) by the 
deceptive quality of the instructions and the nature 
of the tests, which tend to prevent S’s recognizing 
the connection between his response and the forbid- 
den impulse. 

Henry and Rotter (6) reported a study of situa- 
tional influences on Rorschach responses which 
amounted to a manipulation of these approach- 
avoidance tendencies. They administered the 
Rorschach to a control group, using the standard 
Klopfer instructions, but the experimental group 
was given the added information that the test has 
been used for detecting and studying serious emo- 
tional disturbance. The additional information 
should theoretically have the effect of keeping 
avoidance gradients relatively higher in the experi- 
mental Ss. Fewer responses indicating impulse 
expression or intense conflict should therefore 
occur in this group. According to Henry and 
Rotter, the experimental group gave fewer re- 
sponses, more good form level, more populars and 
animal responses, more form responses, and fewer 
aggressive responses. 

However, even under standard projective test 
instructions, avoidance signs are almost never 
eliminated from the test data. In fact, under 
certain circumstances, they may be enhanced. A 
more detailed examination of the Miller conflict 
model provides a basis for understanding this. Al- 
though the model was derived from studies of 
animals in spatial conflict situations, Miller has 
suggested that it may usefully be applied to non- 
spatial conflicts in humans. The concept of an 
approach and avoidance gradient has been used to 
describe the overt behavior of an animal with 
respect to an object simultaneously representing 
both a goal and a threat. These gradients may be 
represented schematically as sloping lines on a set 
of coordinates, the abscissa indicating units of 
distance from the object (origin of the coordinates) 
and the ordinate indicating the strength of the 
approach or avoidance tendency. The tendency to 
approach or to avoid is maximal in the presence of 
the object and decreases with distance. The fear 
gradient is steeper than the goal gradient, which 
means that as the organism comes closer to the 
goal, its tendency to run away from it increases 
more rapidly than does its tendency to approach
it. Finally, the strength of the two gradients
depends upon drive level. An increase in drive
raises the height of the entire gradient. Differences 
in heights of the gradients determine the point at 
which the two slopes cross (if at all) and the range 
on the abscissa characterized by approach and 
avoidance behavior, respectively. 
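
The geometry just described can be sketched in a few lines of Python. This is an illustration only: the straight-line form of the gradients, the class and function names, and all numerical values are assumptions introduced here, not part of Miller's presentation.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Gradient:
    """Straight-line gradient (linearity is an assumption of this sketch)."""
    height_at_goal: float  # strength of the tendency at the goal object
    slope: float           # strength lost per unit of distance from the goal

    def strength(self, distance: float) -> float:
        return self.height_at_goal - self.slope * distance

    def with_drive(self, increase: float) -> "Gradient":
        # A rise in drive lifts the whole gradient without changing its slope.
        return Gradient(self.height_at_goal + increase, self.slope)


def crossing_distance(approach: Gradient, avoidance: Gradient) -> Optional[float]:
    """Distance from the goal at which the gradients cross, or None."""
    if avoidance.slope <= approach.slope:
        raise ValueError("the model assumes the avoidance gradient is steeper")
    gap_at_goal = avoidance.height_at_goal - approach.height_at_goal
    if gap_at_goal <= 0:
        return None  # approach is at least as strong as avoidance even at the goal
    return gap_at_goal / (avoidance.slope - approach.slope)


# Hypothetical values chosen only to illustrate the geometry.
approach = Gradient(height_at_goal=10.0, slope=1.0)
avoidance = Gradient(height_at_goal=16.0, slope=3.0)
stop = crossing_distance(approach, avoidance)
print(stop, approach.strength(stop))
# Farther out than the crossing, approach is the stronger tendency.
print(approach.strength(5.0) > avoidance.strength(5.0))
```

With these illustrative numbers the organism would be expected to advance toward the goal until it reaches the crossing distance and to stop there; raising approach drive with `with_drive` moves the crossing nearer the goal.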

In the light of this model, how may defensive 
fear indications actually be increased in perform- 
ance under projective testing? If avoidance is 
reduced, but not enough to permit the approach all 
the way to the goal (the verbal representation of 
an impulse, motive, or habit tendency), then the 
impulse will be nearer expression. But the point of 
intersection of approach-avoidance gradients will 
be at a higher level than before. Hence, more 
intense signs of conflict and ambivalence should 
be evident in S’s test responses. Similarly, the
Rorschach test, because it is unusually indirect, 
probably lowers the avoidance gradient more than, 
say, the TAT. Therefore, the net approach tend- 
ency is likely to be stronger for Rorschach than 
for TAT responses. Rorschach approach images 
should be more directly represented. At the same 
time, more fear and conflict ought to be elicited 
because S is nearer the goal, and the strength of 
both tendencies is increased. As Miller points out, 
these deductions will hold true only for the range 
within which the approach-avoidance gradients 
intersect. 
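
The deduction can be checked algebraically if the gradients are taken to be straight lines. Linearity and the symbols below are assumptions made for illustration: d is distance from the goal, a_0 and v_0 are the heights of the approach and avoidance gradients at the goal, and s_a and s_v are their slopes.

```latex
% Assumed linear gradients; avoidance is steeper and stronger at the goal.
\begin{align*}
A(d) &= a_0 - s_a d, \qquad V(d) = v_0 - s_v d, \qquad 0 < s_a < s_v,\; v_0 > a_0 \\
A(d^*) = V(d^*) \;&\Longrightarrow\; d^* = \frac{v_0 - a_0}{s_v - s_a} \\
A(d^*) &= a_0 - s_a\,\frac{v_0 - a_0}{s_v - s_a} = \frac{a_0 s_v - v_0 s_a}{s_v - s_a} \\
\frac{\partial d^*}{\partial v_0} &= \frac{1}{s_v - s_a} > 0, \qquad
\frac{\partial A(d^*)}{\partial v_0} = -\frac{s_a}{s_v - s_a} < 0
\end{align*}
```

Under these assumptions, lowering the avoidance height v_0 (while it remains above a_0) moves the crossing point nearer the goal and raises the strength at which the two tendencies balance, which is the sense in which more intense signs of conflict would be expected.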

On theoretical grounds, then, one expects an 
abundance of projective test data expressive of
fears and inhibitions. Those clinicians who regu- 
larly deal with projective test protocols can readily 
testify to this not very remarkable deduction. 
Nevertheless, a significant portion of the validity 
studies concerned with behavioral correlates of 
projective performance seem to ignore avoidance 
behaviors as criteria. Smith and Coleman (13), for 
example, predicted a curvilinear relationship be- 
tween hostile content in the Rorschach and overt 
behavior. Their hypothesis was based on the 
hydraulic conception of an inverse relationship 
between overt behavioral discharge and fantasy. 
A low but statistically significant correlation was 
found between ratings of overt hostile behavior 
and Rorschach content. Ss high in Rorschach 
hostility scores expressed little overt aggression in 
a classroom. The authors suggested that “Ss evi- 
dencing low hostile content scores could theoreti- 
cally have been Ss who discharged tension imme- 
diately and regularly or who tended to be 
impunitive and to show little evidence of hostility 
arousal in their everyday behavior.” They chose 
the latter possibility because their Ss low in 
Rorschach hostile content actually revealed little 
overt hostility in the classroom. However, evalua- 
tion of the relative strengths of approach- 
avoidance tendencies along the relevant hostility 
dimension would permit more precise prediction.
For one thing, it would not be necessary to enter- 
tain alternate hypotheses for the group low in 
Rorschach hostile content and then to select the 
correct hypothesis on a post hoc basis. A single 
prediction could be made for every individual, 
based on approach-avoidance ratios. 

Finney (4) tested the hypothesis that response 
content in the Rorschach can be related to assaul- 
tive behavior. He found the relationship between 
a Destructive Content scale and assaultive be- 
havior ratings to fall just short of the 5 per cent 
level of significance in spite of N = 117. Here, 
again, there was an absence of any attempt to 
conceptualize the avoidance gradient variables and 
to make a behavioral prediction in terms of an 
impulse-control balance. As a matter of fact, there 
seems to have been a lumping together of approach 
and avoidance content within the same scoring 
scale. The Possibly Destructive category, for 
instance, included responses in which the “concept
is more likely than not to attack, injure, or destroy 
something” as well as responses in which the 
concept was considered “frightening or dan- 
gerous.” 

Similarly, Stone (14), in constructing a TAT 
Aggressive Content scale, indicated that one of his 
primary aims was to relate it to overt behavior. Yet 
nowhere in his scoring system does he appear to 
make any provision for the defensive avoidance 
forces which are of such importance in modifying 
or inhibiting the behavioral resultant of im- 
pulses. 

In concluding that amount of hostility dis- 
played by a patient in the TAT is not by itself a 
reliable index for predicting overt hostile behavior 
under stress, Gluck (5) noted the desirability of 
considering other factors, such as anxiety asso- 
ciated with hostility and direction of hostility. 
Studies which have attempted to take these vari- 
ables into account (11, 12) seem to have been 
more successful in relating fantasy material to 
overt behavior than those mentioned above. 

Research by McClelland and his associates (10) 
has led to a distinction between two aspects of 
achievement motivation, hope of success and fear 
of failure. As Clark and others (1) have pointed 
out, this amounts to a distinction between an ap- 
proach motive and an avoidance motive. Clark 
found that, with regard to a level of aspiration 
measure, students at the extremes of the aspiration 
continuum have lower n-Achievement scores than 
students in the middle of the continuum. How- 
ever, when n-Achievement scores were separated 
into positive goal imagery and deprivation imagery 
categories, it became clear that the n-Achievement 
score of the middle group was overwhelmingly a 
function of deprivation imagery. 

Thus far, the discussion of the Miller conflict
model has been confined to those circumstances
in which there is assumed intersection of approach-
avoidance gradients prior to reaching the goal.
Where the gradient of avoidance is so weak that it 
no longer intersects the gradient of approach, a 
situation exists which may have relevance to 
another problem. This has to do with the rather 
widely held hypothesis that fantasy has a substitu- 
tional or compensatory relationship to overt 
behavior. Although commonly held, this view has 
been questioned by some investigators. McClelland 
et al. have been among the doubters on the basis of 
inferences from TAT fantasy data. Purcell (12) 
suggested that certain individuals may actually 
be stimulated to overt action by engaging in 
fantasy. This suggestion was made because of the 
finding, contrary to the fantasy substitution 
hypothesis, that antisocial individuals produced 
more aggressive fantasy than nonantisocial neu- 
rotics. Antisocial persons generally have very weak 
avoidance tendencies with respect to aggression. It 
may be that under circumstances in which the 
approach tendency is so strong relative to avoid- 
ance that no intersection occurs, fantasy tends to 
become a preparatory stimulus to action rather 
than a drive-reducing substitute. The reason for 
this hypothesized differential role of fantasy may 
be that persons high in approach and low in avoidance
have not been faced with the need for learning
to utilize fantasy in a substitutive capacity because
behavioral discharge was readily accessible.
Feshbach (3) found that an insulted group of
students who had an opportunity to express aggression
in fantasy displayed significantly less aggression
toward the experimenter who provoked
hostility than did the control groups who engaged 
in nonfantasy activities. Presumably these stu- 
dents were, in the main, individuals with fairly 
well developed inhibitions (high avoidance gradi- 
ents) in the aggressive sphere. Perhaps the same 
experiment repeated with antisocial individuals 
would yield quite different results. Or, were the Ss 
divided into two groups, one having relatively 
high and the other relatively low approach-avoid- 
ance ratios, fantasy might be found to function 
more effectively as a drive-reducing agent for the 
latter. 

A basic question, of course, is that of how to 
measure the strength of approach-avoidance 
tendencies in projective test responses. Numerous 
scoring schemes have been proposed for the TAT 
and Rorschach. In selecting measurement indices 
for the two tendencies, it may be well to keep in 
mind such established indices of habit strength as 
probability of response evocation, latency, and 
amplitude or intensity (7). These suggest corre- 
sponding measures in the test data. Two common 
measures might be used to reflect probability of 
response evocation: frequency of a thematic or 
imagery type in a given protocol and the relative
novelty or idiosyncratic quality of an image or
theme in terms of normative data. Latency might 
be evaluated by the initial appearance of a type of 
theme with reference to time and to position in the 
response sequence. Scaling of amplitude or inten- 
sity would at first require a more qualitative judg- 
ment regarding the content form in which a given 
habit tendency or impulse is couched. It is evident 
that similar content may be produced by two 
persons in ways that suggest important variations 
in the kind of control exercised over the impulse 
represented. 
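
As a rough illustration of how the first two kinds of index might be computed from a coded protocol, consider the sketch below. The coding scheme (one theme label per response), the function names, and the normative proportions are all hypothetical; intensity is omitted because, as noted above, it would at first require qualitative judgment.

```python
from typing import Mapping, Optional, Sequence


def theme_frequency(coded_responses: Sequence[str], theme: str) -> float:
    """Proportion of responses in one protocol that carry the theme."""
    if not coded_responses:
        raise ValueError("protocol has no responses")
    hits = sum(code == theme for code in coded_responses)
    return hits / len(coded_responses)


def theme_latency(coded_responses: Sequence[str], theme: str) -> Optional[int]:
    """1-based position of the theme's first appearance, or None if absent."""
    for position, code in enumerate(coded_responses, start=1):
        if code == theme:
            return position
    return None


def theme_novelty(theme: str, norm_proportions: Mapping[str, float]) -> float:
    """One minus the normative proportion: rarer themes score nearer 1."""
    return 1.0 - norm_proportions.get(theme, 0.0)


# Hypothetical protocol, one code per response in order of production.
protocol = ["neutral", "aggression", "neutral", "aggression", "fear"]
norms = {"neutral": 0.60, "aggression": 0.25, "fear": 0.15}  # invented norms

print(theme_frequency(protocol, "aggression"))
print(theme_latency(protocol, "aggression"))
print(theme_novelty("aggression", norms))
```

Position in the response sequence is used here as the latency index; elapsed time could be substituted where it has been recorded.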

Another important conceptual omission in this 
field has to do with the notion of generalization. 
Behavioral prediction is likely to be accurate to 
the extent that the variables present in the test 
situation are similar to those in the prediction 
situation. This may include not only the characteristics
peculiar to the tester-subject relationship
but also the actual content of the test
stimuli, e.g., the kind of figures shown in the TAT 
pictures. For example, an experimental test valida- 
tion that required the prediction of reaction to 
frustration among adults would probably be 
doomed to mediocre results at best without a more 
precise definition of the problem. A person’s be- 
havioral reaction to frustration is probably in 
large part a function of the frustrating agent and 
the situational context. For some individuals, 
depending upon whether the frustration was in- 
duced by a child, a peer, or a superior, the reaction 
may be one of overt irritation and anger, sullen 
withdrawal, or an outwardly compliant, submis- 
sive response. It is not enough simply to seek to 
relate projective data to, say, hostile behavior. 
The ideal to be pursued is an exact statement of 
the nature of the hostile behavior criterion—hostile 
to what degree, how expressed, toward whom, and 
under what circumstances. Then, for specific 
behavioral predictions, stimulus materials can be 
meaningfully selected or constructed so as to have 
them more closely resemble the predictive situation 
which constitutes the criterion. 

Kagan (8) illustrated this point of view in an 
experiment demonstrating that frequency of 
fighting themes in TAT-like pictures is more 
directly related to overt fighting behavior than to 
other types of fantasy aggression. He constructed 
13 pictures especially and specifically designed to 
elicit relevant responses. On the other hand, Gluck 
(5), in his study of provocation of hostile behavior 
under stress, seemed to overlook completely the 
implications of a generalization gradient between 
the test and predictive situations. The stress in this 
experiment was a “hostile, domineering authority 
figure.” Yet the TAT cards used were 1, 11, 10,
8GF, 13MF, and 18GF. Pictures which might 
have elicited the most meaningful information on 
the basis of situational similarity, e.g., 7BM, 12M,
and 8BM (cards in which male authority figures
are commonly projected), were not used. Perhaps 
also pertinent here is one result of the Smith and 
Coleman (13) investigation already cited. They 
reported that the only Rorschach content relation- 
ship significant at the .01 level was that between 
Rorschach hostile content and physical hostility. 
It would be of interest to know how much of the 
content involved percepts of physical hostilities 
and whether an analysis similar to Kagan’s would 
yield a similar result. 

The results of this brief review and analysis of 
some of the literature strongly support Korner’s 
(9) advice to distinguish the real from the avoid- 
able sources of invalidity in the field of projective 
techniques by a more careful theoretical analysis. 


SUMMARY 


This paper has sought to demonstrate that the 
negative or inconclusive results obtained in many 
validity studies of projective techniques are, in part 
at least, determined by important defects in the 
experimental conceptualization of the problem. 
The major deficiencies noted were (a) a tendency 
to ignore the significant avoidant aspects of behavior
as reflected in the test data, and (b) a
failure to recognize the importance of the generalization
gradient from test to predictive situation
as a relevant variable affecting predictive accuracy. 
An hypothesis was offered regarding a differential 
role of fantasy in relation to overt behavior. 
Finally, certain concepts from general behavior 
theory, particularly the Miller conflict model, were 
used in formulating the ideas expressed. 


SOME EFFECTS OF INVOLVEMENT UPON EVALUATION¹


HAROLD B. GERARD 
Bell Telephone Laboratories, Murray Hill, N. J. 


The general hypothesis being investigated
here concerns how involvement in a group activity
affects the evaluation of the group members’ contribution;
the greater the involvement, the greater are the
predicted effects upon evaluation. Involvement
was varied by superimposing two experimental
manipulations. Access to the major activity of the
group was varied and so was the degree to which
the activity had potential motivational consequences.


1 The writer wishes to thank Daniel Wilner for his
help in running the experiment. Thanks are also due
Ruby Weinberg for help with the analysis and to Morton
Deutsch for suggestions and criticism. This study,
which was conducted at the Research Center for Human
Relations, New York University, was sponsored
by the Office of Naval Research, Contract NONR
285 (10). The United States Government is authorized
to reprint this article in whole or in part.


METHOD 


Subjects. The 36 individuals who comprised nine 
four-person experimental groups were students in 
the investigator’s introductory psychology class at New
York University. Each of the experimental groups was 
composed either entirely of men or entirely of women. 

The experimental setting. A few days before the experiment,
the class members were asked to be prepared
to discuss the chapter in their text on intelligence tests.
On the day of the experiment the class was divided into 
groups of four persons each. They were told to discuss 
the material within the framework of several general 
questions concerning intelligence tests with the aim of 
formulating some specific questions which they would 
like to put before the entire class at the next class meet- 
ing. Individuals were assigned to the groups so as to 
minimize the number of acquaintances. 

By a random process, one individual was assigned 
to the role of recorder. The recorder was told that he 
could not participate in the discussion but should take 
notes on what was said. The other three members were 
the active individuals whom we shall henceforth call the 
discussants. A group discussion ensued which lasted for 
thirty minutes. 

The experimental conditions. Accessibility to activi- 
ties should produce high involvement only if the activi- 
ties are seen as motivationally relevant, i.e., if engaging 
in the activity has consequences for the individual. The 
experiment was designed to produce two degrees of 
motivational relevance. Ss in four of the groups were 
told that the discussions would be carefully observed by 
the experimenter and the quality of their performance 
would in part determine the grade they would receive 
in the course. Ss in the other five groups were not given 
these instructions. In this latter condition of low moti- 
vational relevance, involvement is allowed to vary as 
a function of numerous chance factors operating in the 
situation, whereas in the high condition a systematic 
heightening of involvement is expected. The treat- 
ments were run in adjoining rooms. 


The questionnaire. After the discussion, each S 
ranked the group members concerning certain 
performance criteria: 

1. Who knows the most. 

2. Who explained things the best. 

3. Who contributed the most. 

They were also asked to predict the rankings 
made by each of the other participants on all 
three criteria. 


RESULTS 


The rank order correlation coefficients between 
the orderings S predicted each of the others 
would make and the orderings each actually 
made were computed. Since the recorder did not 
participate in the discussion, his rank placement 
in all of the orderings was eliminated. His rank- 
ings of the others, however, were included. The 
average rho for the predictions each S made of the 
others’ rank orderings on the three performance 
criteria was then computed. Thus, there were 
nine rhos for each S, one for each of the rank 
orderings made by the three others in his group 
on each of the three aspects of group process 
which were then averaged² to obtain a measure of
his accuracy of predicting the others’ evaluations. 
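
The averaging step can be sketched as follows, on the assumption that the "z-score conversions" mentioned in the footnote refer to Fisher's z transformation (the text does not say so explicitly). The function name, the clipping value, and the sample rhos are hypothetical. Because a rank order of three persons often yields a rho of exactly 1 or -1, for which the transformation is undefined, the sketch clips such values; how the original analysis treated them is not stated.

```python
import math
from typing import Iterable


def mean_correlation(rhos: Iterable[float], clip: float = 0.999) -> float:
    """Average correlations by way of Fisher's z (assumed method).

    Each rho is converted to z = atanh(rho), the z values are averaged,
    and the mean z is converted back to a correlation with tanh.
    Values of exactly +1 or -1 are clipped to +/-clip, a convenience
    assumed here because atanh is undefined at those points.
    """
    zs = [math.atanh(max(-clip, min(clip, r))) for r in rhos]
    if not zs:
        raise ValueError("no correlations to average")
    return math.tanh(sum(zs) / len(zs))


# Nine hypothetical rhos for one S (three raters by three criteria).
sample = [1.0, 0.5, 0.5, -0.5, 0.5, 1.0, 0.5, 0.5, -0.5]
print(round(mean_correlation(sample), 3))
```

Averaging in z units gives more weight to high correlations than a simple arithmetic mean of the rhos would.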


2 These averages are based upon z-score conversions.


TABLE 1

MEAN ACCURACY OF PREDICTION BY THE RECORDER AND DISCUSSANTS
UNDER HIGH AND LOW MOTIVATIONAL RELEVANCE

Condition                        Recorder      Discussant
High motivational relevance        .79            .40
Low motivational relevance      [illegible]    [illegible]
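TABLE 2

AVERAGE RANK* FOR DISCUSSANTS ON THE THREE PERFORMANCE CRITERIA

Column groups: High Motivational Relevance and Low Motivational Relevance,
each with the columns Self Rank, Ranked by Others, and Expected Rank.
Rows: Knowledge, Contribution, Explanation.

The cell values are scrambled in the source and cannot be assigned to
columns with confidence. As extracted, reading down each block:

    2.29 2.40 / 1.80 2.19 / 2.20 2.31    (two adjacent columns)
    2.53 / 4.07 / 2.29
    2.19 / 1.83 / 1.98
    2.46 / 2.08 / 2.34
    2.00 / 1.75 / 1.92    (printed beside Knowledge, Contribution, Explanation)

* The lower the number the higher the rank.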


In Table 1, these indices are presented separately
for the recorder and the discussants under
high and low motivational relevance. The
recorder-discussant difference under high motivational
relevance is quite striking (p < .003 by the
sign test),³ whereas no such difference appears
under low personal relevance. A trend is also
indicated between discussants under high and
low motivational relevance (p < .15 by t).

How can we best account for the relative inability
of the discussants under high motivational
relevance to predict the rank orderings correctly?
The figures in Table 2 indicate the degree to which
discussants overestimated their productive value
to the group.⁴ Since the data for the recorder are
based on too few cases to be statistically reliable,
they are not presented. The self-rating figures
indicate the average rank placement an individual
gave to himself. The ratings of others indicate
the average rank placement given him by the
other group members. The figures show a statistically
significant tendency (p < .05 by t) for
discussants under high motivational relevance to
overestimate their value to the group, whereas
no statistically significant difference exists under
low relevance. These t tests were performed by
comparing, for each S, his mean self placement on
the three criteria with the mean placement of him
by the others.


3 It seemed advisable to test the difference between
recorder and discussant accuracy by means of the sign
test. The test involved comparing, in each experimental
group, the accuracy of prediction of the recorder with
each of the three discussants.

4 Since the recorder was not included in computing
the average rhos presented in Table 1, data for the recorder
are omitted in Table 2. It is somewhat meaningless
to expect the recorder to evaluate his own performance
or to have any expectations at all concerning how
others had evaluated him since he did not participate.
During the questionnaire administration, Ss encountered
considerable difficulty in this respect. The questionnaire
should have been constructed so as to eliminate
evaluations of the recorder’s performance.
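
The sign test applied to the recorder-discussant comparisons can be sketched as below. The one-sided form, the dropping of ties, the function name, and the accuracy differences are assumptions for illustration; the report does not give the individual comparisons.

```python
from math import comb
from typing import Sequence


def sign_test_p(differences: Sequence[float]) -> float:
    """One-sided sign test (assumed form).

    Returns the probability of observing at least as many positive
    differences as were found if positive and negative signs were
    equally likely. Zero differences (ties) are dropped.
    """
    positives = sum(d > 0 for d in differences)
    negatives = sum(d < 0 for d in differences)
    n = positives + negatives
    if n == 0:
        raise ValueError("no nonzero differences")
    return sum(comb(n, k) for k in range(positives, n + 1)) / 2 ** n


# Hypothetical recorder-minus-discussant accuracy differences:
# four groups with three discussants each give twelve comparisons.
diffs = [0.4, 0.3, 0.5, 0.2, 0.6, 0.1, 0.3, 0.4, 0.2, 0.5, 0.3, -0.1]
print(sign_test_p(diffs))
```

Each comparison pairs the recorder with one discussant of the same group, so the twelve signs are not fully independent; the sketch ignores this, as the sign test itself does.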

In order to determine whether or not motiva- 
tional relevance in this situation affected how an 
individual thought the others in the group valued 
him, the average rank the individual expected 
others to assign him was computed and compared 
with the rank they actually gave him. The data 
on expected rank are presented in the third 
column of Table 2. Once again overestimation 
occurs only under high motivational relevance 
(p < .05 by t).
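
The t tests used for these comparisons pair two mean placements for each S, so a paired-difference t statistic is one natural way to sketch them. That the original analysis took exactly this form is an assumption, and the function name and the rank values below are hypothetical.

```python
import math
from statistics import mean, stdev
from typing import Sequence, Tuple


def paired_t(first: Sequence[float], second: Sequence[float]) -> Tuple[float, int]:
    """Paired-difference t statistic and its degrees of freedom."""
    if len(first) != len(second) or len(first) < 2:
        raise ValueError("need two equal-length samples of at least two pairs")
    diffs = [a - b for a, b in zip(first, second)]
    spread = stdev(diffs)  # sample standard deviation of the differences
    if spread == 0:
        raise ValueError("all differences are identical; t is undefined")
    t = mean(diffs) / (spread / math.sqrt(len(diffs)))
    return t, len(diffs) - 1


# Hypothetical mean ranks for six discussants (lower number = higher rank).
self_rank = [1.7, 2.0, 1.3, 2.0, 1.7, 2.3]
rank_by_others = [2.3, 2.2, 2.0, 2.6, 1.9, 2.4]
t, df = paired_t(self_rank, rank_by_others)
print(round(t, 2), df)
```

A negative t here means that Ss placed themselves higher (numerically lower) than the others placed them, which is the direction of overestimation described in the text.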

The discussants’ failure under conditions of 
high motivational relevance to predict the rela- 
tive evaluations placed upon the membership by 
others (Table 1) can therefore perhaps be at- 
tributed to the fact that, as a result of their desire 
to do well, they tended to overestimate their 
value to the group. When an individual has a 
personal stake, as it were, in the quality of his 
own performance, his judgments of his performance
tend to be equilibrated with his expectations.
It is as though the judgmental process itself 
serves, in part at least, to realize one’s desires. 


SUMMARY 


Four-person discussion groups were created in a 
classroom in which three members were active 
participants and one member functioned in the 
peripheral role of recorder. In four of the groups, 
the discussion occurred under high motivational 
relevance (a personal stake was introduced) 
whereas in five of the groups the discussion oc- 
curred under low motivational relevance (lack of 
an explicit personal stake). 

After the discussion, each S was asked to rank 
order the membership of his group concerning a 
number of aspects of group process and to guess 
the rankings made by each of the others in his 
group. The data revealed that the recorder-dis- 
cussant difference in accuracy of prediction oc- 
curred only under high personal relevance. 
Further analysis was undertaken which demon- 
strated that this differential accuracy can be ac- 
counted for by the tendency for discussants under 
high motivational relevance to over-value their 
own performance. 


Received May 14, 1957. 


TASK DIFFICULTY AND CONFORMITY PRESSURES¹


JANET FAGAN COLEMAN, ROBERT R. BLAKE, AND JANE SRYGLEY MOUTON


University of Texas 


Are conformity pressures more easily exerted
when Ss are less adequately informed re- 
garding the correct response? A direct relation- 
ship seems self-evident. Yet, when judgments 
were made easy or difficult by changing rate of 
clicking for a metronome counting task, con- 
formity to social pressures was found to be un- 
related to “difficulty” (5). Where the objectively 
correct judgments were rendered “easy” or 
“hard” by varying the relative lengths of the 
standard and comparison lines, the relationship 
was not confirmed in a study with adults (1) 
though it was with children (2). On the other 
hand, susceptibility has been reported as a func- 
tion of difficulty for arithmetic problems (5, 6). 
While susceptibility may vary directly with diffi- 
culty, results from experiments reported do not 
consistently demonstrate the validity of the re- 
lationship. However, none of the tasks described 
above are subjected to conformity pressures 
under ordinary social conditions. 


1 The study reported here was made possible by a 
research grant from The Hogg Foundation for Mental 
Hygiene, The University of Texas, Austin, Texas. 


The following experiment evaluates the rela- 
tionship with respect to kinds of information 
encountered in social situations of everyday life. 
The materials judged included questions about 
current events, geography, government, litera- 
ture, language, and science. Since an individual 
is expected to be informed about such questions 
of information, the existence of a relationship 
between susceptibility and difficulty should be 
most easily demonstrated for these kinds of ma- 
terials. 


METHOD 


The experiment was presented to each of the sixty 
Ss as a comparative study of college students today
with those twenty years ago in regard to ability to deal 
with questions involving general information. 


Description of the Test Situation 


Each participant understood that to save time he 
was one of four who would be working together. Simu- 
lated group procedure was used to create a standard 
influence situation (3, 4). Fifteen of the men and fifteen
of the women responded to each item after hearing the
reports of three men. The other fifteen men and fifteen
women responded after three women.


TABLE 1

ITEM CONTENT, DIFFICULTY LEVEL, AND PER CENT OF Ss CONFORMING TO THE INCORRECT ANSWER

Item
 1  Knight is the name of the Governor of:
      Iowa, Indiana, California*ᵇ, Maineᶜ
 2  The smallest state in the union is:
      Connecticut, Delaware, New Jersey, Rhode Island*
 3  Staunch means the same as or the opposite of:
      Easy, Unwavering*, Uneven, Stupid
 4  The chief justice of the Supreme Court is:
      Holmes, Burton, Warren*, Clark
 5  Stalwart means the same as or the opposite of:
      Tall, Weak*, Wild, Agitated
 6  Topsy was:
      Arabian, Eskimo, Indian, Negro*
 7  The opposite of militarism is:
      Nationalism, Nihilism, Pacifism*, Communism
 8  Morose means the same as or the opposite of:
      Urgent, Docile, Gloomy*, Thorough
 9  Avid means the same as or the opposite of:
      Eager*, Vivid, Arid, Volatile
10  A ward is:
      A Public Park, A Political Division of a City*, A Seat of County Government,
      The Business District of a City
11  A gene is a structure believed to be responsible for:
      Resistance to Infection, Inheritance of a Trait*, The Formation of Gametes,
      The Development of Scurvy
12  Irksome means the same as or the opposite of:
      Mortal, Possible, Tiresome*, Inferior

Numeric columns: Per Cent Difficultyᵃ (Men, Women) and Per Cent Conforming (Men, Women).
These columns are only partly legible in the source and cannot be aligned
with the items. Values as extracted, line by line:

    10 50 33 60
    08 06 27 20
    13 27 37
    30 20 27
    36 43 40
    14 40 33
    30 37 56
    10
    33
    40

ᵃ Ten per cent difficulty means that 10% of the standardizing group gave an incorrect response.

ᵇ The correct answer for each item is indicated by an asterisk (*).

ᶜ The response given by the three others prior to S’s response is italicized in each case.


Task 


The task involved twelve items embedded among a 
total of 35 questions with instructions to choose the 
correct answer from four alternatives given. The first 
three individuals who responded gave an incorrect 
answer to the 12 items as shown in Table 1. They gave 
both correct and incorrect responses for the 23 non- 
critical items. An answer was scored as a conforming 
response if it was the same as that by the three persons 
responding before the S. 
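
The scoring rule amounts to a simple match between S's answer and the answer given by the three prior respondents. A minimal sketch follows, in which the function name and the sample answers are hypothetical.

```python
from typing import Sequence


def percent_conforming(subject_answers: Sequence[str], group_answer: str) -> float:
    """Per cent of Ss whose answer to one critical item matches the
    answer given by the three persons who responded before them."""
    if not subject_answers:
        raise ValueError("no answers to score")
    conforming = sum(answer == group_answer for answer in subject_answers)
    return 100.0 * conforming / len(subject_answers)


# Hypothetical answers of four Ss to one critical item.
answers = ["Maine", "California", "Maine", "Iowa"]
print(percent_conforming(answers, group_answer="Maine"))
```

Note that an S who gives a different incorrect answer is not counted as conforming under this rule.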


Selection of Items 


Level of difficulty was based on the reactions to the 
items by 50 men and 50 women answering under pri- 
vate conditions. Items were selected, for another part 
of the study, so that some were better known by men 
than women and vice versa. Judged in terms of percent- 
age of the standardizing group giving the correct 
answer, difficulty ranged from zero to 54 per cent.
Since the standardizing group and experimental 
groups are comparable in terms of age, academic level 
and major, more frequent yielding under experimental 
conditions on difficult than on easy items would estab- 
lish that conformity pressures are more easily exerted 
when the S is less well informed regarding the correct 
answer. 


RESULTS AND DISCUSSION 


The frequency of conformity is shown in Table 
1, for men and women. The relationship between 
the difficulty of the items and frequency of con- 
formity is evaluated by rank-difference correla- 
tions. The correlation for men is .58, significant 
beyond the .05 level, and for women, .89, signifi- 
cant beyond the .01 level. Although the sex of 
the others was varied systematically, there was no 
interaction between sex and level of difficulty. 
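
A rank-difference correlation of the kind reported here can be computed as in the sketch below. The function names and the item values are hypothetical, and the classical formula used is exact only when there are no tied ranks; with ties it is an approximation.

```python
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Ranks from 1 (smallest) upward, tied values sharing their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1  # mean of positions start+1 .. end+1
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


def rank_difference_correlation(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman's rho by the formula 1 - 6 * sum(d^2) / (n * (n^2 - 1))."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length samples of at least two items")
    n = len(x)
    d_squared = sum((a - b) ** 2 for a, b in zip(average_ranks(x), average_ranks(y)))
    return 1 - 6 * d_squared / (n * (n * n - 1))


# Hypothetical per cent difficulty and per cent conforming for five items.
difficulty = [10, 20, 30, 40, 50]
conforming = [5, 15, 10, 30, 40]
print(rank_difference_correlation(difficulty, conforming))
```

In the study itself the correlation would be taken over the twelve critical items, separately for men and for women.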

For both men and women, therefore, conformity 
pressures are more easily exerted when Ss are 
least well informed and the task involves items 
of general information. The results are inter- 
preted as indicating that if an individual is certain 
of the correct answer, he is more able to resist 
pressures being exerted by being more able to 
respond in terms of internal cues. Supplemental 
external information as the basis of his response 
is more frequently employed when the person is 
unfamiliar with the correct answer. A direct im- 
plication of these findings is that susceptibility 
to conformity pressures can be decreased by increasing
an individual’s ability to make a competent
selection of responses.


“FAULTY” COMMUNICATION AND THE SPIRAL AFTEREFFECT:
A METHODOLOGICAL CRITIQUE¹


EUGENE S. GOLLIN
Queens College
AND NORMAN BRADFORD
University of Minnesota


Failure to report aftereffect following fixation
of a rotating Archimedes spiral has recently
received considerable attention by investigators 
interested in utilizing this failure as a diagnostic 
indicator of brain damage (e.g., 1, 3). The same 
technique has been utilized in a developmental 
study by Harding, Glassman, and Helz (2). In this 
study, no child under CA 55 months or under MA 
60 months achieved the success criterion of three 
out of four trials. These data are interpreted as a 
possible reflection of neurological immaturity, and 
a functional similarity between children and adults 
with organic cerebral damage is alleged. Recogniz- 
ing that their results may not represent a percep- 
tual failure but rather a failure of report, Harding 
and his associates remain convinced of the validity 
of the verbal responses. Since the presence or ab- 
sence of perception of the spiral aftereffect is im- 
plicitly or explicitly related to structural and 
functional characteristics of the cerebral cortex by 
almost all investigators who have employed the spi- 
ral technique, the validity of verbal report is a mat- 
ter of some consequence. Some Ss who fail to meet 
the criterion of success, especially young children 
and brain-damaged adults, may, in view of the 
nature of the aftereffect, be unable to supply the 
verbal designators necessary for the achievement 
of a positive score. 

1 This research was completed while the senior author 
was at the University of Minnesota. It was supported 
by a grant from the Graduate School of the University. 

The aftereffect, depending upon the direction of 
rotation of the spiral, consists of either a phenomenally 
receding and/or contracting visual field, or an 
approaching and/or expanding field. The usual 
method of obtaining the aftereffect report is to 
have S fixate a rotating spiral (rotation speed 
varies among investigators, but 78 RPM is 
fairly standard) for approximately thirty seconds 
and then report his visual impression when the 
rotation is abruptly terminated. If during the 
rotation period the spiral appears to be expanding 
or approaching, upon cessation of rotation the 
phenomenal field appears to be receding or con- 
tracting and vice versa. 

The aftereffect is not confined to the spiral per se. 
Following fixation of the rotating spiral for a 
sufficient time, virtually any object in the surround 
may be employed for the elicitation of the verbal 
report. This fact widens the methods by which the 
phenomenon may be investigated. 

No attempt has been made, so far as is known, 
to determine the verbal designators which Ss 
employ when confronted with stimulus situations 
which involve actual rather than illusory expansion 
and approach, and actual contraction and reces- 
sion. If such designators were available before 
Ss were tested in the aftereffect situation, the 
value of verbal report as a reflection of phenomenal 
experience would be considerably enhanced. 


METHOD 


An Archimedes spiral of 3½ turns (1260°) with an 
11}4-inch diameter served as the stimulus disc. It was 
mounted on a phonograph turntable which could be 
driven at various speeds and reversed by a switching 
arrangement. The turntable was presented to Ss at a 
distance of 10 feet, perpendicular to the line of sight. 
The rotation speed was 78 RPM and fixation time was 
30 seconds. 

Large white balloons with spirals of black ink painted 
on the surface opposite the nozzles were prepared. 
These balloons were used in the pre- and postfixation 
periods as follows: S viewed an inflating, deflating, and 
fixed balloon. Inflation was accomplished by means of 
compressed air. The hissing noise accompanying the 
inflation was also maintained during the deflating and 
fixed presentations. S was given a period of exploration 
of the equipment, and rapport was established. E then 
presented the stationary balloon spiral and said, “See 
this design? It’s called a spiral. See how it goes round 
and round? Can you say spiral? I have some balloons 
with a spiral on each one. I’m going to show you one 
balloon at a time. I want you to tell me each time if the 
spiral is getting bigger, or if the spiral is getting smaller, 
or if the spiral is staying the same size.” S was then 
seated directly in front of the mounted disc spiral 
which was covered to prevent distraction during the 
initial procedure. The directions, “Remember, each 
time, tell me if the spiral is getting bigger, or if it is 
getting smaller, or if it is staying the same size,” were 
repeated. The compressed air was released into a de- 
flated balloon held in the position of the mounted spiral 
and the instructions, “Tell me about the spiral now,” 
were repeated several times if necessary. The air was 
then slowly released from the balloon and the instructions 
repeated, “Tell me about the spiral now.” Finally, 
the stationary balloon was presented with the same 
instructions. 

The mounted spiral was then set in motion and un- 
covered with the following instructions: “Now I want 
you to look at this spiral. See this point right here 
(E points)? Look right at this point until I tell you to 
stop. Don’t take your eyes off the point.” (Continual 
encouragement was given to keep looking at the center 
point.) At the end of the fixation period, the stationary 
balloon with spiral was quickly placed in front of the 
mounted spiral (obscuring it) with the instructions 
“Tell me about this spiral now.” 

Each S was given four trials in the order AB BA 
or BA AB. A toy was given between trials to allow a 
time lapse in order to avoid perseveration and to main- 
tain interest. S passed if his report was correct on three 
of the four trials. 

Thus, the balloons with spirals painted upon them 
served in the prefixation period to elicit verbal desig- 
nators which accompany an actual approaching and 
expanding or contracting and receding spiral, and in 
the postfixation period the stable balloon was used to 
elicit the illusory aftereffect response. The rotating 
spiral itself was employed only for fixation purposes. 

The Ss were 23 children with a chronological age 
range of 38 to 63 months, mean CA = 49.65 months, 
SD = 7.55 months. Mental ages ranged from 42 to 88 
months, mean MA = 63.13 months, SD = 13.77 
months. Ss were drawn from the nursery school of the 
Institute of Child Welfare, University of Minnesota. 


RESULTS AND DISCUSSION 


Seventeen of the twenty-three Ss achieved the 
success criterion (at least three out of four correct 
responses) by utilizing the correct verbal designa- 
tors when they viewed the spiral painted on the 
unmoving balloon, after they had fixated the 
rotating disc. In the Harding, Glassman, and 
Helz study there were 40 Ss in the CA range 46 
through 63 months of whom 26 failed to achieve 
the success criterion. In the present study, only 
one of 17 Ss in the same age range failed. 

Similarly, in the Harding, Glassman, and Helz 
study, there were 29 Ss in the MA range 52 
through 84 months. Twelve of these Ss failed. In 
the present study, only two of 15 Ss in this MA 
range failed. 

The youngest S in the earlier study to achieve 
success was CA 55 months. In the present investi- 
gation, the youngest successful S was CA 45 
months. The lowest MA at which passing occurred 
in the earlier study was 61 months. In the present 
study the lowest passing MA was 48 months. 

It is particularly noteworthy that five of the six 
Ss who failed in the present study were also unable 
to respond correctly in the pretest situation, 
where they were required to respond to actual 
rather than illusory changes. It may be that failure 
to report illusory changes among some Ss is based 
on the unavailability of the necessary verbal 
designators rather than upon perceptual deficit. 

The balloon technique permits determination of 
verbal report availability prior to fixation of the 
rotating spiral, and the posttest elicitation of response 
by means of a balloon which remains stable 
with regard to size and motion appears to evoke 
correct responses from children younger in both 
CA and MA than previous results suggest. 


SUMMARY 


Twenty-three children were tested for spiral 
aftereffect under a method designed to obtain 
their responses under actual as well as illusory 
conditions. It was found that virtually all Ss 
who responded correctly under actual conditions 
were able to report correctly under illusory condi- 
tions. Ss considerably younger in CA and lower 
in MA were able to achieve success in the task 
under present conditions than were able to achieve 
success in a previously reported investigation. 


SOCIAL DESIRABILITY RATINGS OF PERSONALITY VARIABLES BY 
NORWEGIAN AND AMERICAN COLLEGE STUDENTS 


O. IVAR LOVAAS¹ 


University of Washington 


THE purpose of this study was to find out the 
degree of relationship between social de- 
sirability judgments made by Norwegian and 
American college students on a set of statements 
about behaviors that a person might display. It 
has been found in previous research (4) that high 
school students from different socioeconomic 
classes show high agreement in their social de- 
sirability judgments of the personality statements 
investigated in this study. A similar high relation- 
ship was found between the judgments of American 
and Japanese-American college students (3). These 
correlations are all in the nineties. The scaling pro- 
cedure in these studies was the same as in the 
study here reported. 


The statements in Edwards’ Personal Preference 
Schedule (PPS) were used. This schedule 
consists of 135 statements about behavior grouped 
into 15 personality variables, 9 statements defining 
each variable. These variables are described as 
manifest needs by Murray (5) and are labelled 
Achievement, Deference, Order, Exhibition, 
Autonomy, Affiliation, Intraception, Succorance, 
Dominance, Abasement, Nurturance, Change, En- 
durance, Sex, and Aggression. For a more com- 
plete description of this inventory, see Edwards (2). 

The Norwegian translation of these statements 
was checked by faculty members of the Scan- 
dinavian Department of the University of Wash- 
ington and of the Psychology Department of the 
University of Oslo. The translated list of state- 
ments was presented to senior classes at two 
gymnasia in Oslo in the fall of 1954. The Nor- 
wegian sample consisted of 86 Ss, with a mean age 
of 17 years. The American sample (see Edwards 
[1]) contained 152 college students, with a mean 
age somewhat higher than the Norwegian sample.² 
The statements were presented one at a time and 
the Ss rated the statements on a nine-point scale 
ranging from extremely socially desirable to ex- 
tremely socially undesirable. The method of suc- 
cessive intervals was used to obtain the scale values 
of the items. 

1 The author expresses appreciation to Allen L. 
Edwards of the University of Washington and Per 
Saugstad of the University of Oslo for their help in 
this study. 

2 Klett (4) found that there were no significant 
differences in social desirability ratings at these age 
levels. 

The product-moment correlation between the 
scale values of the Norwegian and the American 
sample was .78. This correlation, which indicates 
a high agreement with 61 per cent of the variance 
being common, is probably an underestimate of 
the true relationship, since errors in translation 
would attenuate the obtained value. 
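
The 61 per cent figure follows from squaring the correlation, since the proportion of variance two measures share is the square of their product-moment correlation. A minimal arithmetic check, using only the reported value of .78 (the variable names are illustrative):

```python
r = 0.78                      # reported correlation between the two sets of scale values
common_variance = r ** 2      # proportion of variance in common
remaining_variance = 1 - common_variance

print(round(common_variance * 100))     # per cent of variance in common
print(round(remaining_variance * 100))  # per cent not in common
```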

Thus, it is possible that the remaining 39 per 
cent of the variance is made up to an unknown 
degree of error variance due to changes in the 
statements through translation. However, the 
assumption was made that chances are 50-50 of 
one group rating a statement higher or lower than 
the other group in terms of social desirability. 
Working on this assumption, consistent differences 
were found between the two groups in their rating 
of statements pertaining to 4 of the 15 variables. 
(This difference, significant at the .05 level, was 
evaluated in terms of the sign test for paired observations 
and the binomial expansion, where p = .5, 
q = .5, and n = 9.) Americans rated the statements 
pertaining to Order, Intraception, and Abasement 
as more socially desirable than did the Norwegians. 
The Norwegians, on the other hand, 
rated the statements pertaining to Aggression as 
more socially desirable than did the Americans. 
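
The sign test described above can be sketched as a binomial tail probability. This is an illustration under stated assumptions: the report does not say whether a one-tailed or two-tailed criterion was used, so both are shown, and the function name is hypothetical. With n = 9 statements per variable and p = q = .5, the question is how many of the nine statements must be rated higher by the same group before chance becomes an unlikely explanation.

```python
from math import comb


def sign_test_p(k, n, p=0.5, two_sided=True):
    """Probability of k or more same-direction signs among n pairs.

    The two-sided value doubles the upper tail, which is appropriate
    for p = 0.5 and k of at least n / 2.
    """
    upper_tail = sum(
        comb(n, i) * p ** i * (1 - p) ** (n - i) for i in range(k, n + 1)
    )
    return min(1.0, 2 * upper_tail) if two_sided else upper_tail


n_statements = 9
for k in (7, 8, 9):
    print(
        k,
        round(sign_test_p(k, n_statements, two_sided=False), 4),
        round(sign_test_p(k, n_statements, two_sided=True), 4),
    )
```

Under these assumptions, 8 of 9 statements in the same direction falls below the .05 level on either criterion, while 7 of 9 does not.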

The personality variables employed in this 
study have the advantage of being clearly defined 
as specific behavior is described by each statement. 
However, specificity limits one’s information to 
only a few facets of a trait. A trait like aggression 
has many ways of expressing itself in behavior, and 
only nine rather overt expressions of this trait were 
employed in this study. Thus the difference on Ag- 
gression might disappear or reverse itself if the two 
groups had rated more subtle expressions of this 
trait. 

The author feels that the paper-and-pencil test 
approach, even with a translated version of the 
test, is feasible for cross-cultural studies—provided 
the test points to specific observable behaviors, as 
does the PPS. An approach like this throws some 
light on the similarities and differences of be- 
haviors that are acceptable in the two cultures. 


RELATIONSHIP OF MANIFEST ANXIETY TO STIMULUS GENERALIZATION¹, ² 


ROBERT E. FAGER AND IRWIN J. KNOPF 
State University of Iowa³ 


MANY INVESTIGATIONS have been concerned 
with relationships between the Taylor 
Manifest Anxiety Scale (MAS) and a variety of 
learning phenomena. In the area of stimulus 
generalization (SG), Rosenbaum (4) found greater 
responsiveness to generalized stimuli in a spatial 
situation for high MAS college students than for 
a low MAS group, but only when the Ss were 
given strong intermittent shock during their 
performance. Using a temporal generalization 
paradigm, Wenar (6), however, reported greater 
responsiveness for high MAS group for weak 
shock and buzzer conditions. Buss (2) trained 
psychiatric patients to make a verbal response to 
wooden blocks of a given height. Controlling for 
directional effects and measuring generalization 
to blocks of graded heights, he found no difference 
between high and low MAS groups in their gener- 
alization gradients. 

In view of these somewhat inconsistent findings 
and the suggestion that situational factors may 
account for differences in MAS results (5), the 
present study was designed to examine the rela- 
tionship between different levels of anxiety and 
stimulus generalization in psychiatric patients in 
a situation different from Buss’s. 


METHOD 


Subjects. One hundred sixty-nine patients, repre- 
senting a wide variety of psychiatric conditions, were 
given the Taylor Manifest Anxiety Scale. All were at 
least 16 years old, were not on any drugs or physical 
therapy, and scored less than 8 on the L scale of the 
MMPI. They were about evenly distributed by sex and 
had a median age of 31.2 and an educational level of 
12.4 years. Those in the highest (scores of 35 and above) 
and lowest (17 and below) fifths of the MAS distribu- 
tion were selected as the high anxiety (HA) group 
(N = 34) and the low anxiety (LA) group (N = 31). 





1 A modification of this paper was read at the Midwestern 
Psychological Association meetings in Chicago, 
1957. 

2 The assistance of Julia Weinberg and Jon Weinberg 

3 Division of Psychology, Department of Psychiatry, 
College of Medicine. 


Apparatus. The apparatus was similar to that developed 
by Brown (1).⁴ It consisted of seven stimulus 
lamps mounted in a horizontal row on a black plywood 
panel and spaced at 8-degree intervals. The panel was 
curved so that all lamps were 5 ft. from S, who was 
seated in front of a stand on which was mounted a 
spring-return toggle switch. When S moved the switch 
forward or backward, the words “win” or “lose” would 
light up in the upper or lower left quadrants of a small 
box below the central lamp. The upper and lower right 
quadrants also contained the words “win” and “lose,” 
and were activated by E to give S information as to 
the actual outcome of each trial. 

Procedure. The instructions included the following 
features: S was told that each of the seven lamps 
represented a race horse. When a lamp was lighted, it 
indicated that this horse was running a race against 
other horses not represented on the stimulus panel. 
The seven horses on the panel were not competing 
against each other. Each time a light went on, it was 
a new race, and S was supposed to guess whether the 
horse would win or lose that race. If S thought the 
horse would win, he was to push the switch forward. 
If he thought it would lose, he was to pull the switch 
toward him. S was told to remember how well each 
horse did in order to increase the accuracy of his bets. 

Each trial (or race) consisted of the presentation of 
a stimulus light by E. S’s response automatically ter- 
minated the stimulus, at which time E activated the 
appropriate light to indicate the actual outcome of 
the race. Unknown to S, a predetermined schedule of 
80 per cent win for the central lamp but only 20 per 
cent win for the other six lamps was used. There were 
210 randomly ordered races. The response measure 
was the frequency of win responses to each stimulus 
lamp. 
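
The predetermined outcome schedule can be sketched as follows. This is an illustration, not the authors' procedure: it assumes that the 210 races were divided equally among the seven lamps (30 each), which the text does not state, and the function name, lamp numbering, and random seed are hypothetical.

```python
import random


def build_schedule(n_lamps=7, races_per_lamp=30, center=3,
                   p_center=0.8, p_other=0.2, seed=0):
    """Return a shuffled list of (lamp, outcome) pairs.

    ASSUMPTION: every lamp is presented equally often (210 / 7 = 30).
    Lamps are numbered 0..6 with lamp 3 as the central lamp.
    """
    rng = random.Random(seed)
    races = []
    for lamp in range(n_lamps):
        p_win = p_center if lamp == center else p_other
        n_win = round(races_per_lamp * p_win)
        outcomes = ["win"] * n_win + ["lose"] * (races_per_lamp - n_win)
        races.extend((lamp, outcome) for outcome in outcomes)
    rng.shuffle(races)  # random order of races
    return races


schedule = build_schedule()
wins_per_lamp = [
    sum(1 for lamp, outcome in schedule if lamp == i and outcome == "win")
    for i in range(7)
]
print(len(schedule), wins_per_lamp)
```

Under the equal-presentation assumption, the central lamp wins 24 of its 30 races and each peripheral lamp wins 6 of 30, so any graded tendency to bet "win" on lamps near the center reflects generalization rather than the actual payoff of those lamps.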

A Lindquist Type I trend analysis (3) was used to 
test the difference in the shape of the gradients elicited 
from the high and low MAS groups. 


RESULTS AND CONCLUSIONS 


The data for all of the psychiatric Ss (N = 169) 
indicate that the frequency of win responses 
decreases progressively on each side of the central 
lamp as the angle separating a given lamp from 
the center increases. The difference in gradients 
among the six peripheral lights was significant 
beyond the .001 level. Presumably, the response 
tendency to bet “win” to the central stimulus 
generalized to the peripheral stimuli with those 
closest to the center most affected. This result 
with psychiatric Ss is in keeping with Brown’s 
findings with college students. 

4 The authors wish to express their appreciation to 
Judson S. Brown for his consultation and suggestions 
concerning the apparatus. 

Figure 1 presents the percentage of win responses 
plotted for the high and low MAS groups. 
The results show no evidence of differences and 
corroborate Buss’s findings. 

In the light of the studies cited, these results 
suggest that there is no relationship between MAS 
and stimulus generalization in psychiatric Ss. 
Moreover, situational factors do not seem impor- 
tant in limiting the generality of such an interpre- 
tation. Since the relationships between anxiety 
and learning phenomena are generally well recog- 
nized, these negative results presumably reflect 
the inadequacy of the Taylor scale as a relevant 
index of anxiety levels in psychiatric subjects. 